Tricoteuses-Assemblee
Retrieve, clean up & handle French Assemblée nationale's open data
Retrieval of open data (in JSON format) from Assemblée nationale's website
mkdir ../assemblee-data/
npx babel-node --extensions ".ts" -- src/scripts/retrieve_open_data.ts --fetch ../assemblee-data/
Reorganizing open data files and directories into cleaner (and split) directories
npx babel-node --extensions ".ts" -- src/scripts/reorganize_data.ts --no-validate-raw ../assemblee-data/
Note: These reorganized files are also available in Tricoteuses / Data / Données brutes de l'Assemblée. They are updated on a regular basis.
Validation & cleaning of JSON data
npx babel-node --extensions ".ts" -- src/scripts/clean_reorganized_data.ts ../assemblee-data/
Note: These split & cleaned files are also available in Tricoteuses / Data / Données nettoyées de l'Assemblée with the `_nettoye` suffix. They are updated on a regular basis.
Retrieval of députés' pictures from Assemblée nationale's website
npx babel-node --extensions ".ts" -- src/scripts/retrieve_deputes_photos.ts --fetch ../assemblee-data/
Retrieval of sénateurs' pictures from Assemblée nationale's website
npx babel-node --extensions ".ts" -- src/scripts/retrieve_senateurs_photos.ts --fetch ../assemblee-data/
Retrieval of pending amendments from Assemblée nationale's website
(Pending amendments are amendments waiting to be processed by Assemblée services.)
npx babel-node --extensions ".ts" -- src/scripts/retrieve_pending_amendments.ts --incremental ../assemblee-data/
Retrieval of documents from Assemblée nationale's website
npx babel-node --extensions ".ts" -- src/scripts/retrieve_documents.ts --textes ../data/assemblee-textes ../data/assemblee-nettoye/Dossiers_Legislatifs_XV_nettoye/documents/**/*.json
Test loading everything in memory
Test loading small split files
npx babel-node --extensions ".ts" --max-old-space-size=2048 -- src/scripts/test_load.ts ../assemblee-data/
Test loading big non-split files
npx babel-node --extensions ".ts" --max-old-space-size=2048 -- src/scripts/test_load_big_files.ts ../assemblee-data/
Note: The big non-split open data files should not be used. Use small split files instead.
Initial generation of TypeScript & JSON schema files from JSON data.
npx quicktype --acronym-style=camel -o src/raw_types/acteurs_et_organes.ts ../assemblee-data/AMO{10,20,30,40,50}_*.json
Edit `src/raw_types/acteurs_et_organes.ts` to:
- Replace `r("Secretaire02")` with `""`.
- Remove 2 definitions of `Secretaire02` and replace it with `string` elsewhere.
- Replace regular expression `r\("PaysNaisEnum"\)` with `""`.
- Remove definitions of regular expression `[^ ]PaysNaisEnum` and replace it with `string`.
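The first kind of step in the list above (replacing a `r("...")` reference with `""`) is mechanical and could be scripted. What follows is a minimal sketch, not something shipped with this repository: the helper names `escapeRegExp` and `inlineTypeRef` are hypothetical, and the sample line is made up to resemble a quicktype type-map entry. It only rewrites the references; removing the type definitions and substituting `string` still has to be done by hand.

```typescript
// Hypothetical helper: escape characters that are special in a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: replace every r("<typeName>") reference in the
// generated source with "" (the entry the README asks for).
function inlineTypeRef(source: string, typeName: string): string {
  const pattern = new RegExp(`r\\("${escapeRegExp(typeName)}"\\)`, "g");
  return source.replace(pattern, '""');
}

// Made-up sample line, for illustration only.
const sample = '{ json: "secretaire02", js: "secretaire02", typ: r("Secretaire02") },';
console.log(inlineTypeRef(sample, "Secretaire02"));
```

Matching the exact name, including the closing quote, keeps the helper from touching other types whose names merely start with the same prefix.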
npx quicktype --acronym-style=camel -o src/raw_types/agendas.ts ../assemblee-data/Agenda_{XIV,XV}.json
Edit `src/raw_types/agendas.ts` to:
- Replace `r("SessionRef")` with `""`.
- Remove 2 definitions of `SessionRef` and replace it with `string` elsewhere.
npx babel-node --extensions ".ts" --max-old-space-size=8192 -- src/scripts/raw_types_from_amendements.ts ../assemblee-data/
Edit `src/raw_types/amendements.ts` to:
- Replace `r("ActeurRefElement")` with `""`.
- Remove 2 definitions of `ActeurRefElement` and replace it with `string` elsewhere.
- Replace `r("AuteurRapporteurOrganeRefEnum")` with `""`.
- Remove 2 definitions of `AuteurRapporteurOrganeRefEnum` and replace it with `string` elsewhere.
- Replace `r("Code")` with `""`.
- Remove 2 definitions of `Code` and replace it with `string` elsewhere.
- Replace `r("CodeMissionMinefi")` with `""`.
- Remove 2 definitions of `CodeMissionMinefi` and replace it with `string` elsewhere.
- Replace `r("DivisionRattacheeEnum")` with `""`.
- Remove 2 definitions of `DivisionRattacheeEnum` and replace it with `string` elsewhere.
- Replace `r("GouvernementRefEnum")` with `""`.
- Remove 2 definitions of `GouvernementRefEnum` and replace it with `string` elsewhere.
- Replace `r("GroupePolitiqueRefEnum")` with `""`.
- Remove 2 definitions of `GroupePolitiqueRefEnum` and replace it with `string` elsewhere.
- Replace `r("LigneCreditLibelle")` with `""`.
- Remove 2 definitions of `LigneCreditLibelle` and replace it with `string` elsewhere.
- Replace `r("PrefixeOrganeExamen")` with `""`.
- Remove 2 definitions of `PrefixeOrganeExamen` and replace it with `string` elsewhere.
- Add:
```typescript
export interface Amendements {
  textesLegislatifs: TexteLegislatif[]
}
```
- Add:
```typescript
export interface AmendementWrapper {
  amendement: Amendement
}
```
- Add:
```typescript
"Amendements": o([
  { json: "textesLegislatifs", js: "textesLegislatifs", typ: a(r("TexteLegislatif")) },
], false),
```
- Add:
```typescript
"AmendementWrapper": o([
  { json: "amendement", js: "amendement", typ: r("Amendement") },
], false),
```
- Add the following static methods to class `Convert`:
```typescript
public static toAmendements(json: string): Amendements {
  return cast(JSON.parse(json), r("Amendements"));
}
public static amendementsToJson(value: Amendements): string {
  return JSON.stringify(uncast(value, r("Amendements")), null, 2);
}
public static toAmendementWrapper(json: string): AmendementWrapper {
  return cast(JSON.parse(json), r("AmendementWrapper"));
}
public static amendementWrapperToJson(value: AmendementWrapper): string {
  return JSON.stringify(uncast(value, r("AmendementWrapper")), null, 2);
}
```
npx quicktype --acronym-style=camel -o src/raw_types/dossiers_legislatifs.ts ../assemblee-data/Dossiers_Legislatifs_{XIV,XV}.json
Edit `src/raw_types/dossiers_legislatifs.ts` to:
- Replace `r("FamCode")` with `""`.
- Remove 2 definitions of `FamCode` and replace it with `string` elsewhere.
- Replace regular expression `r\(".+CodeActe"\)` with `r("CodeActe")`.
- Remove definitions of regular expression `[^ ]+CodeActe;` and replace it with `CodeActe;`.
- Add `import { CodeActe } from "../shared_types/codes_actes"` at the top of the file.
- Remove occurrences of `"[^"]*CodeActe":` and replace them with one `"CodeActe": Object.values(CodeActe),`.
- Remove occurrences of `CodeActe \{`.
- Replace regular expression `r\(".*OrganeRef"\)` with `""`.
- Remove definitions of regular expression `[^ ]*OrganeRef` and replace it with `string`.
- Replace regular expression `r\(".*DossierRef"\)` with `""`.
- Remove definitions of regular expression `[^ ]*DossierRef` and replace it with `string`.
- Replace regular expression `r\(".*AuteurMotion"\)` with `""`.
- Remove definitions of regular expression `[^ ]*AuteurMotion` and replace it with `string`.
- Replace regular expression `r\(".*DenominationStructurelle"\)` except for `DocumentDenominationStructurelle` with `""`.
- Remove 2 definitions of regular expression `[^ ]*DenominationStructurelle` except for `DocumentDenominationStructurelle` and replace it with `string`.
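Two of the regular-expression steps above can be sketched in code: unifying every `...CodeActe` reference, and inlining `...DenominationStructurelle` references while sparing `DocumentDenominationStructurelle`. This is only an illustration under stated assumptions: the function names `unifyCodeActeRefs` and `inlineDenominationRefs` are hypothetical, the type names in the usage lines are invented, and the patterns use `[^"]` instead of the `.` shown above so that one match cannot span two references on the same line.

```typescript
// Hypothetical helper: rewrite r("<prefix>CodeActe") to r("CodeActe").
// A bare r("CodeActe") has no prefix and is left unchanged.
function unifyCodeActeRefs(source: string): string {
  return source.replace(/r\("[^"]+CodeActe"\)/g, 'r("CodeActe")');
}

// Hypothetical helper: replace r("<prefix>DenominationStructurelle") with "",
// except for r("DocumentDenominationStructurelle"), which the negative
// lookahead skips.
function inlineDenominationRefs(source: string): string {
  const pattern = /r\("(?!DocumentDenominationStructurelle")[^"]*DenominationStructurelle"\)/g;
  return source.replace(pattern, '""');
}

// Invented type names, for illustration only.
console.log(unifyCodeActeRefs('typ: r("PurpleCodeActe")'));
console.log(
  inlineDenominationRefs(
    'a: r("OrganeDenominationStructurelle"), b: r("DocumentDenominationStructurelle")'
  )
);
```

The definitions themselves (the second step of each pair in the list) still need manual review, since removing an enum or interface safely depends on where its closing brace is.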
npx babel-node --extensions ".ts" -- src/scripts/merge_scrutins.ts -v ../assemblee-data/
npx quicktype --acronym-style=camel -o src/raw_types/scrutins.ts ../assemblee-data/Scrutins_{XIV,XV_fusionne}.json
Edit `src/raw_types/scrutins.ts` to:
- Replace `r("ActeurRef")` with `""`.
- Remove 2 definitions of `ActeurRef` and replace it with `string` elsewhere.
- Replace `r("GroupeOrganeRef")` with `""`.
- Remove 2 definitions of `GroupeOrganeRef` and replace it with `string` elsewhere.
- Replace `r("MandatRef")` with `""`.
- Remove 2 definitions of `MandatRef` and replace it with `string` elsewhere.
- Replace `r("ScrutinOrganeRef")` with `""`.
- Remove 2 definitions of `ScrutinOrganeRef` and replace it with `string` elsewhere.
- Replace `r("SessionRef")` with `""`.
- Remove 2 definitions of `SessionRef` and replace it with `string` elsewhere.
- Add:
```typescript
export interface ScrutinWrapper {
  scrutin: Scrutin
}
```
- Add:
```typescript
"ScrutinWrapper": o([
  { json: "scrutin", js: "scrutin", typ: r("Scrutin") },
], false),
```
- Add the following static methods to class `Convert`:
```typescript
public static toScrutinWrapper(json: string): ScrutinWrapper {
  return cast(JSON.parse(json), r("ScrutinWrapper"));
}
public static scrutinWrapperToJson(value: ScrutinWrapper): string {
  return JSON.stringify(uncast(value, r("ScrutinWrapper")), null, 2);
}
```
Updating JSON schema files and validating JSON files
- Convert `src/types/*.ts` into JSON schemas for comparison purposes
for f in src/types/*.ts ; do b=$(basename $f .ts) ; npx typescript-json-schema src/types/$b.ts '*' > src/schemas/converted_from_type/$b.json ; done
- Manually update `src/schemas/*/*.json` to account for these differences
- Verify the JSON files validate with the updated schema
npx babel-node --extensions .ts -- src/scripts/validate_json.ts --repository=$(git rev-parse --show-toplevel) --dataset ../data/assemblee-nettoye/AMO*nettoye
npx babel-node --extensions .ts -- src/scripts/validate_json.ts --repository=$(git rev-parse --show-toplevel) --dataset ../data/assemblee-nettoye/Dossiers_Legislatifs_XV_nettoye
etc.
If an error occurs and the schema must be fixed:
- Verify the schema works by using `--dev` to use the schema from the current working directory instead of fetching it from the tag matching the version mentioned in the JSON file. For instance, if the file `acteurs/PA766283.json` has `schemaVersion = "acteur-1.0"`, it will use the schema found at schema-acteur-1.0 and not the current working directory, except if `--dev` is used.
- Once the schema is verified to work, add a tag matching the directory of the schema. For instance, for `amendement/Amendement.json` or any of its references (i.e. `amendement/*.json`), set the tag `schema-amendement-X.Y`.
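The lookup rule described above (use the tag named after `schemaVersion` unless `--dev` is given) can be summarized in a few lines. This is a minimal sketch of the rule as stated here, not the code of `validate_json.ts`; the `SchemaSource` type and the `schemaSourceFor` function are hypothetical names introduced for illustration.

```typescript
// Hypothetical result type: where the schema for a JSON file is read from.
type SchemaSource =
  | { kind: "workingDirectory" }
  | { kind: "tag"; tag: string };

// Hypothetical helper: with dev set, use the working directory; otherwise
// use the tag "schema-<schemaVersion>", e.g. "schema-acteur-1.0".
function schemaSourceFor(schemaVersion: string, dev: boolean): SchemaSource {
  if (dev) {
    return { kind: "workingDirectory" };
  }
  return { kind: "tag", tag: `schema-${schemaVersion}` };
}

console.log(schemaSourceFor("acteur-1.0", false));
console.log(schemaSourceFor("acteur-1.0", true));
```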
The tag with the highest version will be used by `src/scripts/clean_reorganized_data.ts` to add a `schemaVersion` field for all JSON files created in a `*_nettoye` repository from that point on. The goal is for a JSON file to validate against an immutable schema identified by a version tag and to allow each JSON file to have a different version of the schema.
See the discussion in the forum for more information.
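Picking "the tag with the highest version" could look like the sketch below. It assumes tags are shaped `schema-<name>-<major>.<minor>` with integer components, as in `schema-acteur-1.0`, and compares them numerically so that 1.10 ranks above 1.9. The names `SchemaTag`, `parseSchemaTag` and `latestSchemaVersion` are hypothetical, and the tag list in the usage lines is invented; how `clean_reorganized_data.ts` actually selects the tag is not shown in this README.

```typescript
// Hypothetical parsed form of a tag such as "schema-acteur-1.0".
interface SchemaTag {
  name: string;
  major: number;
  minor: number;
}

// Returns undefined for tags that do not follow the assumed shape.
function parseSchemaTag(tag: string): SchemaTag | undefined {
  const match = /^schema-(.+)-(\d+)\.(\d+)$/.exec(tag);
  if (match === null) {
    return undefined;
  }
  return { name: String(match[1]), major: Number(match[2]), minor: Number(match[3]) };
}

// Returns the schemaVersion value ("<name>-<major>.<minor>") of the highest
// tag for the given schema name, or undefined when no tag matches.
function latestSchemaVersion(tags: string[], name: string): string | undefined {
  let best: SchemaTag | undefined;
  for (const tag of tags) {
    const parsed = parseSchemaTag(tag);
    if (parsed === undefined || parsed.name !== name) {
      continue;
    }
    if (
      best === undefined ||
      parsed.major > best.major ||
      (parsed.major === best.major && parsed.minor > best.minor)
    ) {
      best = parsed;
    }
  }
  return best === undefined ? undefined : `${best.name}-${best.major}.${best.minor}`;
}

// Invented tag list, for illustration only.
const tags = ["schema-acteur-1.0", "schema-acteur-1.10", "schema-acteur-1.9", "schema-amendement-2.0"];
console.log(latestSchemaVersion(tags, "acteur"));
```

Comparing the parsed numbers rather than the raw strings matters here, because a plain string sort would place `1.9` after `1.10`.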
Helpers to create documentation
npx babel-node --extensions .ts -- src/scripts/document_dossiers_legislatifs.ts --data ../data/assemblee-nettoye/Dossiers_Legislatifs_{XIV,XV}_nettoye/dossiers/**/*.json
See the data-site README for more information about how it is used.
Obsolete or Now Useless Scripts
Validation & cleaning of big non-split files
npx babel-node --extensions ".ts" --max-old-space-size=8192 -- src/scripts/clean_data.ts ../assemblee-data/
Note: The big non-split open data files should not be used. Use small split files instead.
Retrieval of députés' non-open-data information from Assemblée nationale's website
npx babel-node --extensions ".ts" -- src/scripts/retrieve_deputes_infos.ts --fetch --parse ../assemblee-data/