Handle votes with joint responsible committees #1122

tillprochaska · 2025-03-16T14:16:45Z

A procedure can have multiple responsible committees (see Rules of Procedure and an example procedure).

I’ve updated the procedure scraper to properly handle procedures with joint responsible committees and added tests.
I’ve updated the CSV export and removed the responsible_committee_code column from the votes.csv table. Instead, there are now separate tables committees.csv and responsible_committee_votes.csv.
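
To illustrate the new layout, here is a minimal sketch of how a vote with several responsible committees could be split into one link row per committee. The column names (`vote_id`, `committee_code`), the helper functions, and the sample data are assumptions for illustration and may not match the actual export code.

```python
import csv
import io

# Hypothetical sample data; the real export reads votes from the database.
votes = [
    {"id": 1, "responsible_committees": ["LIBE", "IMCO"]},
    {"id": 2, "responsible_committees": ["ENVI"]},
    {"id": 3, "responsible_committees": []},
]


def build_link_rows(votes):
    """Return one (vote_id, committee_code) row per responsible committee."""
    return [
        {"vote_id": vote["id"], "committee_code": code}
        for vote in votes
        for code in vote["responsible_committees"]
    ]


def to_csv(rows, fieldnames):
    """Serialize rows to CSV text (stands in for writing a file)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


link_rows = build_link_rows(votes)
link_csv = to_csv(link_rows, ["vote_id", "committee_code"])
print(link_csv)
```

A vote with a joint committee produces several link rows, and a vote without a responsible committee produces none.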

After deployment, we need to re-scrape the procedure files and then re-aggregate the votes.

linusha

More of a conceptual objection: I think moving the responsible committees out of votes.csv severely limits the usefulness of that file and makes finding votes on specific topics (or maybe better: limiting the list of votes to sensible candidates) much harder.
I know that one can just join the two tables, but I am thinking about a quick Excel thing here.

There are multiple ways to handle this imo:

We could say it does not matter, go ahead anyway, and rethink this once negative feedback comes in. I don't think relying on that is a good solution in this case.
I would need to spend some more time reading the procedural rules, but as long as there is a hard cap on the number of responsible committees, we could just include several columns in votes.csv, most of them empty.
I would even be okay with a solution that has just one column and string-concatenates the codes ("LIBE, IMCO", as an example).
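
For reference, the join and the string-concatenated column can be sketched with the standard library alone. The file contents and the column names (`id`, `display_title`, `vote_id`, `committee_code`) are assumptions for illustration, not the actual export schema.

```python
import csv
import io
from collections import defaultdict

# Hypothetical contents of votes.csv and responsible_committee_votes.csv.
VOTES_CSV = "id,display_title\n1,Example vote A\n2,Example vote B\n3,Example vote C\n"
LINKS_CSV = "vote_id,committee_code\n1,LIBE\n1,IMCO\n2,ENVI\n"


def add_committee_column(votes_text, links_text):
    """Join both tables and add one string-concatenated committee column."""
    codes_by_vote = defaultdict(list)
    for link in csv.DictReader(io.StringIO(links_text)):
        codes_by_vote[link["vote_id"]].append(link["committee_code"])

    rows = []
    for vote in csv.DictReader(io.StringIO(votes_text)):
        # Votes without a responsible committee get an empty cell.
        vote["responsible_committees"] = ", ".join(codes_by_vote.get(vote["id"], []))
        rows.append(vote)
    return rows


rows = add_committee_column(VOTES_CSV, LINKS_CSV)
for row in rows:
    print(row["id"], row["responsible_committees"])
```

The trade-off is that a concatenated cell is easy to filter in a spreadsheet but has to be split again before any per-committee aggregation.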

Side note: As it's cheap, should we also collect the committees for opinion? That probably should not even be in this PR, and I would also be totally fine with extracting it into a separate table in any case, possibly duplicating information if we decide to leave the responsible committees in votes.csv. We have duplicated information in the export anyway. I just wanted to raise the matter here as it seemed fitting.

backend/howtheyvote/export/__init__.py

See https://www.europarl.europa.eu/doceo/document/RULES-9-2022-07-11-RULE-058_EN.html

linusha

I still think the way we separate the tables here now at least points towards an inconsistency in our mental model of the possible target audiences for the export.

I don't think spending time or energy on resolving this makes sense. Can be merged.

linusha · 2025-04-26T20:23:56Z

backend/howtheyvote/export/__init__.py

+        with self.committees.open() as committees:
+            exp = func.json_each(Vote.responsible_committees).table_valued("value")
+            query = (
+                select(func.distinct(exp.c.value)).select_from(Vote, exp).order_by(exp.c.value)
+            )
+            committee_codes = Session.execute(query).scalars()
+
+            for committee_code in committee_codes:
+                committee = Committee[committee_code]
+                committees.write_row(
+                    {
+                        "code": committee.code,
+                        "label": committee.label,
+                        "abbreviation": committee.abbreviation,
+                    }
+                )


Just for my understanding - this means that the committees table is not guaranteed to contain all committees, only those that were responsible for a vote that we collected, correct?

Yes, exactly. The list of committees we obtain from the Publications Office contains every committee that has ever existed (some of which existed only for a very short time decades ago). This is similar to what we do for e.g. EuroVoc concepts: we only export what’s actually relevant to the dataset.
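
The effect of the `json_each` query can be reproduced in isolation with SQLite directly. This is a minimal sketch, assuming the committee codes are stored as a JSON array in a text column and that the local SQLite build includes the JSON functions; the table layout and sample data are made up for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE votes (id INTEGER PRIMARY KEY, responsible_committees TEXT)")
conn.executemany(
    "INSERT INTO votes (id, responsible_committees) VALUES (?, ?)",
    [
        (1, json.dumps(["LIBE", "IMCO"])),  # joint responsible committees
        (2, json.dumps(["ENVI"])),
        (3, json.dumps(["LIBE"])),
    ],
)

# json_each expands each JSON array into one row per element, so DISTINCT
# yields every committee code that is referenced by at least one vote.
query = """
    SELECT DISTINCT j.value
    FROM votes, json_each(votes.responsible_committees) AS j
    ORDER BY j.value
"""
committee_codes = [row[0] for row in conn.execute(query)]
print(committee_codes)
```

Committees that no vote references never appear in the result, which is why the exported table only covers committees relevant to the dataset.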

tillprochaska force-pushed the multiple-responsible-committees branch 2 times, most recently from 6f66b36 to 57f9698 Compare March 16, 2025 18:05

tillprochaska marked this pull request as ready for review March 16, 2025 18:05

tillprochaska requested a review from linusha March 16, 2025 18:05

linusha reviewed Mar 18, 2025

tillprochaska commented Apr 23, 2025

tillprochaska commented Apr 23, 2025

tillprochaska force-pushed the multiple-responsible-committees branch 5 times, most recently from faa2cf2 to 7eae4cf Compare April 23, 2025 15:15

Handle joint responsible committee

e1faf83

See https://www.europarl.europa.eu/doceo/document/RULES-9-2022-07-11-RULE-058_EN.html

tillprochaska force-pushed the multiple-responsible-committees branch from 7eae4cf to e1faf83 Compare April 23, 2025 15:18

linusha approved these changes Apr 26, 2025

tillprochaska merged commit a9b8057 into main May 11, 2025
4 checks passed

tillprochaska deleted the multiple-responsible-committees branch May 11, 2025 20:16
