List of Images

Figure 1. Coverage and size of older versions of the ParlaMint corpora by country/region (v. 2.1 versus v. 4.1). (Credit: Matyáš Kopp).
Figure 2. Coverage of ParlaMint II corpora. (Source: Erjavec et al., 2025).
Figure 3. Number of speeches produced in the lower chamber/assembly in a full 4-year period (2018–2021) based on ParlaMint data.
Figure 4. An excerpt from the TEITOK visualisation of a plenary debate from ParlaMint-ES. (source)
Figure 5. Reaching lists of various elements from the corpus description page of ParlaMint-CZ. (source)
Figure 6. Details on speakers from ParlaMint-NL provided in TEITOK. (source)
Figure 7. Constructing a query in TEITOK.
Figure 8. Displaying the context for the selected line.
Figure 9. ParlaMint provenance information in the CLARIN.SI repository (Part 1).
Figure 10. ParlaMint provenance information in the CLARIN.SI repository (Part 2).
Figure 11. Structural elements in ParlaMint-PL. (source)
Figure 12. An example sentence from ParlaMint-PL showing Token ID subscripted in grey.
Figure 13. Linguistic elements in ParlaMint-PL. (source)
Figure 14. A part of the list from ParlaMint-PL enumerating the lemma głosować (to vote) and its different word forms, with UD features added for illustrative purposes. (source)
Figure 15. The beginning of a sentence from ParlaMint-PL showing the original word (the upper line) subscripted by annotations in the following order: lemma/word_lc/lemma_lc (with lc standing for lowercase).
Figure 16. An example from ParlaMint-PL showing the three levels of annotations: pos, feats and dep.
Figure 17. The View options button.
Figure 18. The Text Type section of the noSketch Engine ParlaMint-BE corpus information page grouping metadata categories.
Figure 19. Metadata categories in ParlaMint-BE on speakers (in the red box) and speeches (in the blue boxes).
Figure 20. Text type analysis page in noSketch Engine for ParlaMint-BE showing the statistics for the speaker gender category.
Figure 21. The Frequency analysis button.
Figure 22. The Frequency analysis section with the tab Basic above allowing easy access through presets, and the tab Advanced below allowing user-defined combinations of metadata categories.
Figure 23. Topic annotation schema consisting of the 21 CAP categories expanded with the categories Other and Mix.
Figure 24. Sentence-level metadata in ParlaMint-BE: sentence ID and sentiment information on three different scales.
Figure 25. Metadata categories of content and type at the note structural level in ParlaMint-FR.
Figure 26. Displaying notes in context by selecting note under View options. Check that you have Show more context and KWIC (at the top of the page) selected to get the best view.
Figure 27. Listing all combinations of note type and content by conducting Frequency analysis and selecting the two metadata categories on the Advanced tab.
Figure 28. The name structure with its type category and the distribution of the four values across all parliaments included in the corpus (ParlaMint-XX).
Figure 29. Listing all different occurrences of the selected NER type.
Figure 30. The top-level USAS semantic domains.
Figure 31. The three metadata categories encoding semantic domains.
Figure 32. Three examples of semantic annotations with an indication of the original tagger output (USAS tag) followed by the simplified semantic label (USAS category) after a slash.
Figure 33. Topic distribution between women (red) and men (blue). (Source: Ljubešić et al., n.a.).
Figure 34. Corpus selection for Showcase I – ParlaMint-XX-en 5.0.
Figure 35. Select Manage corpus, Subcorpora and Create subcorpus as the first step in metadata-based subcorpus creation.
Figure 36. Name the subcorpus and select the appropriate metadata values.
Figure 37. Select the Text type analysis function.
Figure 38. Define the relevant parameters to compute the frequency list.
Figure 39. Click the download button and select the preferred format.
Figure 40. Topic distribution in speeches by women and men MPs. Topics previously identified as gender-dominant are circled (red = women, blue = men).
Figure 41. Display the concordance lines of regularMP_F for the topic of Health.
Figure 42. The deactivated + symbol otherwise used to create a subcorpus from concordance lines.
Figure 43. CQL query information on the Concordance page.
Figure 44. Enter the CQL query and metadata values under the Advanced tab.
Figure 45. Create the subcorpus of health-related speeches produced by women MPs.
Figure 46. The list of subcorpora created by users.
Figure 47. Select the Keywords function.
Figure 48. Set the parameters for keyword list computation.
Figure 49. Press this button to download the data.
Figure 50. Press this button to change the focus and reference subcorpus.
Figure 51. Press this button to change the parameters for keywords.
Figure 52. Set the parameters for keyword list computation.
Figure 53. An example of the analysis of keywords from speeches on health delivered by men and women MPs from the ParlaMint-GB corpus.
Figure 54. Add regular expression (regex) to exclude proper nouns.
Figure 55. Open the concordances for the first key semantic domain of the macroeconomics_F subcorpus.
Figure 56. Press this button to get a random sample.
Figure 57. Click the word in red and the three dots to show and expand the context.
Figure 58. Open the concordance lines for the first keyword on the list.
Figure 59. Get the Sentence ID by expanding the metadata section on the left side of the sentence itself.
Figure 60. Setting the parameters to display a given sentence in the original language.
Figure 61. Average sentiment across CAP topics and parliaments, with sentiment scores from 1.2 to 3.2, where darker blue indicates a more negative tone and lighter colours a more positive tone. (Source: Ljubešić et al., n.a.).
Figure 62. Corpus selection for Showcase II – ParlaMint-GB 5.0.
Figure 63. Starting a simple concordance query.
Figure 64. Proceeding with a simple concordance query and limiting your search.
Figure 65. Starting a CQL concordance query.
Figure 66. Running a complex concordance query.
Figure 67. Combining the CQL format with metadata selections in search queries.
Figure 68. Concordance results with numerical details.
Figure 69. An easy path to frequency results for the sentiment annotation of the retrieved concordance lines via the Basic tab.
Figure 70. An advanced path to frequency results for the sentiment annotation of the retrieved concordance lines.
Figure 71. Default and selected frequency information.
Figure 72. Relative frequency of uses of the EU and the European Union before and after Brexit in ParlaMint-GB.
Figure 73. Relative frequency of uses of Europe before and after Brexit in ParlaMint-GB.
Figure 74. Relative density of uses of the EU and the European Union before and after Brexit in ParlaMint-GB.
Figure 75. Relative density of uses of Europe before and after Brexit in ParlaMint-GB.
Figure 76. Frequency information on the use of the EU / European Union in negatively attributed sentences in the Reference subcorpus across parliamentary parties in the Speech.speaker_party_name section on the Frequency page.
Figure 77. Opening concordance lines.
Figure 78. An alternative way of opening concordance lines.
Figure 79. Grouping the selected concordance lines via the Frequency page.
Figure 80. Filtering the selected concordance lines by party name via the Filter page.
Figure 81. Concordance lines with the -0.043 sentiment value uttered by SNP members.
Figure 82. Concordance lines with the -0.043 sentiment value uttered by Conservative Party members.
Figure 83. Adjusting parameters for collocation analysis.
Figure 84. Options for downloading the retrieved collocate list.
Figure 85. Collocates of the EU/European Union in negatively annotated sentences uttered by SNP and Conservative Party members in the post-Brexit period.
Figure 86. Collocates of the EU/European Union in positively annotated sentences uttered by SNP and Conservative Party members in the post-Brexit period.
Figure 87. Concordance lines containing co-occurrences of rejoin and the EU/European Union in positively annotated sentences uttered by SNP members.
Figure 88. Concordance lines containing co-occurrences of rejoin and the EU/European Union in negatively annotated sentences uttered by Conservative Party members.
Figure 89. Concordance lines containing the lemma drag co-occurring with the EU/European Union in negatively annotated sentences uttered by SNP members in the post-Brexit period.
Figure 90. Sentiment information regarding the selected sentences containing the lemma drag according to the 6-level scale (s.senti_6).
Figure 91. Sentiment information regarding the selected sentences uttered by Conservative Party members in the post-Brexit period that contain departure, leave, outside or exit as collocates of the EU/European Union, according to the 6-level scale (s.senti_6).
Figure 92. Creating a subcorpus from selected positively annotated sentences via the Concordance page for keyword analysis.
Figure 93. Identifying keywords in the focus subcorpus as compared with the reference subcorpus of the reference corpus.
Figure 94. Top 20 keywords representing the subcorpus from positive sentences (on the left) and positive mixed sentences (on the right) containing the term EU/European Union and at least one of its four strongest collocates (departure, leave, outside, exit), which were uttered by Conservative MPs in the post-Brexit period.