Commit Graph

23 Commits

Author SHA1 Message Date
stephenmk
7bff70b71c
JMdict: Ensure part-of-speech info is added in non-English versions
Only English-language senses in JMdict contain part-of-speech tags.
This info is displayed to users in definition tags and also used
for deinflecting verbs and adjectives during term lookups.

The old version of Yomichan-Import took the PoS tags from the final
sense in the English version of an entry and applied them to every
sense of every other language. For example, 川・かわ has two senses in
English JMdict: a noun sense and a suffix sense. Therefore every sense
of 川・かわ in every other language was tagged as a suffix.

Instead, I suggest gathering all distinct PoS tags from each English
entry and applying them all to each non-English sense. Every
non-English sense of 川・かわ will therefore be tagged as both a noun
and suffix.
2023-02-02 10:44:16 -06:00
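
The approach suggested in commit 7bff70b71c can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the `Sense` struct, its fields, the language codes, and the function names are all hypothetical, and the glosses are sample data only. It gathers each distinct PoS tag from the English senses of an entry and applies the full set to every non-English sense.

```go
package main

import "fmt"

// Sense is a hypothetical, simplified stand-in for a JMdict sense.
type Sense struct {
	Lang    string   // language code, e.g. "eng" or "ger" (assumed values)
	PoS     []string // part-of-speech tags; only English senses carry them
	Glosses []string
}

// distinctEnglishPoS collects every distinct PoS tag found on the English
// senses, in first-seen order.
func distinctEnglishPoS(senses []Sense) []string {
	seen := make(map[string]bool)
	var tags []string
	for _, s := range senses {
		if s.Lang != "eng" {
			continue
		}
		for _, p := range s.PoS {
			if !seen[p] {
				seen[p] = true
				tags = append(tags, p)
			}
		}
	}
	return tags
}

// applyPoS returns the non-English senses, each tagged with a copy of all
// distinct English PoS tags.
func applyPoS(senses []Sense) []Sense {
	tags := distinctEnglishPoS(senses)
	var out []Sense
	for _, s := range senses {
		if s.Lang == "eng" {
			continue
		}
		s.PoS = append([]string(nil), tags...)
		out = append(out, s)
	}
	return out
}

func main() {
	// Illustrative data modeled on the 川・かわ example: one noun sense and
	// one suffix sense in English, plus one non-English sense.
	kawa := []Sense{
		{Lang: "eng", PoS: []string{"n"}, Glosses: []string{"river"}},
		{Lang: "eng", PoS: []string{"suf"}, Glosses: []string{"River"}},
		{Lang: "ger", Glosses: []string{"Fluss"}},
	}
	for _, s := range applyPoS(kawa) {
		fmt.Println(s.Lang, s.PoS, s.Glosses)
	}
}
```

Taking the union rather than the tags of the final English sense means a non-English sense may carry a tag that does not apply to it, but no applicable tag is lost, which matters for deinflection during lookups.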
stephenmk
19d6d0bb43
Rename some jmdict functions 2023-02-01 19:14:37 -06:00
stephenmk
b826dbf264
Add verification logic for date entry in JMdict
Very old versions of JMdict and unofficial versions are unlikely to
have the publication date entry at the end of the file.
2023-01-30 13:26:26 -06:00
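
A check of the kind described in commit b826dbf264 might look something like the sketch below. It is not the commit's code. It assumes, hypothetically, that the caller has already extracted the gloss strings of the final entry in the file and that a publication date, when present, appears there in `YYYY-MM-DD` form. The function name and the sample gloss text are illustrative.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// datePattern matches a YYYY-MM-DD substring (assumed date format).
var datePattern = regexp.MustCompile(`\d{4}-\d{2}-\d{2}`)

// publicationDate scans the glosses of the final entry for a valid date.
// It returns false when no gloss contains one, as may happen with very
// old or unofficial files.
func publicationDate(lastEntryGlosses []string) (time.Time, bool) {
	for _, g := range lastEntryGlosses {
		m := datePattern.FindString(g)
		if m == "" {
			continue
		}
		// time.Parse rejects impossible dates such as month 13.
		if t, err := time.Parse("2006-01-02", m); err == nil {
			return t, true
		}
	}
	return time.Time{}, false
}

func main() {
	// Sample gloss text for illustration only.
	if d, ok := publicationDate([]string{"Creation Date: 2023-01-22"}); ok {
		fmt.Println("publication date:", d.Format("2006-01-02"))
	}
	if _, ok := publicationDate([]string{"river"}); !ok {
		fmt.Println("no date entry found; falling back")
	}
}
```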
stephenmk
8b4b899959
Hide new JMdict structured content features behind "extra" option
Require `-language=english_extra` to produce the complete version of
the new JMdict dictionary file.

If and when we determine that all the new features are ready to be
included in the dictionary by default, we can remove this logic.
2023-01-29 14:06:50 -06:00
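
One way such gating could be wired up with Go's standard `flag` package is sketched below. Only the `-language=english_extra` value comes from the commit message above. The `options` struct, the `parseOptions` function, and the default value of `english` are assumptions made for illustration.

```go
package main

import (
	"flag"
	"fmt"
)

// options is a hypothetical settings struct.
type options struct {
	language string // base language of the output dictionary
	extra    bool   // whether the new structured content features are enabled
}

// parseOptions reads the -language flag and maps "english_extra" to the
// English dictionary with the extra features switched on.
func parseOptions(args []string) (options, error) {
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	language := fs.String("language", "english", "dictionary language (default is an assumption)")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	opts := options{language: *language}
	if *language == "english_extra" {
		opts.language = "english"
		opts.extra = true
	}
	return opts, nil
}

func main() {
	opts, err := parseOptions([]string{"-language=english_extra"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("language=%s extra=%t\n", opts.language, opts.extra)
}
```

Keeping the switch in one place means that making the features the default later only requires removing the special case.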
stephenmk
abbe183145
Simplify logic for index.json struct 2023-01-28 18:39:08 -06:00
stephenmk
517ef3d052
Fix bug in term score assignments
This commit ensures that terms are grouped among their entries of
origin and displayed in correct sequential order in Yomichan's default
result grouping mode, "Group term-reading pairs."
2023-01-27 19:09:12 -06:00
stephenmk
7bd967915c
Add "forms" term in special circumstances
If a headword appears in multiple entries, then each entry needs a
corresponding "forms" term in the output dictionary.

For example, 軽卒 is the only headword in entry 2275730, but 軽卒 also
appears as an irregular form in entry 1252910. If a "forms" term is
not included for the former entry, then it will appear that 軽卒 is
irregular for all senses in the output dictionary.
2023-01-25 18:26:47 -06:00
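
The condition described in commit 7bd967915c, a headword appearing in more than one entry, can be detected with a simple count. The sketch below is an illustration under assumptions, not the commit's implementation: the `Entry` struct and function names are hypothetical, the second headword of entry 1252910 is a placeholder, and the entry with sequence number 1 is invented for contrast. It covers only the shared-headword case, not whatever other criteria decide when a "forms" term is emitted.

```go
package main

import "fmt"

// Entry is a hypothetical, simplified JMdict entry.
type Entry struct {
	Sequence  int
	Headwords []string
}

// entriesPerHeadword counts how many entries each headword appears in.
func entriesPerHeadword(entries []Entry) map[string]int {
	counts := make(map[string]int)
	for _, e := range entries {
		seen := make(map[string]bool) // count a headword once per entry
		for _, h := range e.Headwords {
			if !seen[h] {
				seen[h] = true
				counts[h]++
			}
		}
	}
	return counts
}

// needsFormsTerm reports whether an entry shares a headword with at least
// one other entry.
func needsFormsTerm(e Entry, counts map[string]int) bool {
	for _, h := range e.Headwords {
		if counts[h] > 1 {
			return true
		}
	}
	return false
}

func main() {
	entries := []Entry{
		{Sequence: 2275730, Headwords: []string{"軽卒"}},
		{Sequence: 1252910, Headwords: []string{"(primary headword)", "軽卒"}}, // placeholder headword
		{Sequence: 1, Headwords: []string{"川"}},                              // invented entry for contrast
	}
	counts := entriesPerHeadword(entries)
	for _, e := range entries {
		fmt.Println(e.Sequence, needsFormsTerm(e, counts))
	}
}
```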
stephenmk
96358e3eb5
Fix function parameter
Sense numbers start at 1, not 0
2023-01-24 08:55:24 -06:00
stephenmk
d8a3b420ee
Exclude "search" and "forms" terms from non-English dictionaries
This allows a user to install the English version and another version
without cluttering their setup with duplicated information.

If a user doesn't want to use the English version, they can get the
"search" and "forms" terms by installing the separate jmdict_forms
file.
2023-01-22 17:55:27 -06:00
stephenmk
abc28bb19d
Add new JMdict version 2023-01-22 14:37:18 -06:00
0e0e449e7e Cleanup 2016-08-06 18:17:02 -07:00
7177da9e1e Cleanup 2016-08-06 11:19:43 -07:00
b40f4a757f WIP 2016-08-05 22:14:29 -07:00
108a0b28e9 WIP 2016-08-05 21:29:35 -07:00
fd9cb2bcff WIP 2016-08-05 21:07:50 -07:00
9708aef745 Conditional pretty json output 2016-08-03 09:12:31 -07:00
9a436ce9a0 WIP 2016-08-02 22:21:06 -07:00
e2d11e2cda WIP 2016-08-02 20:57:47 -07:00
1f467394b2 Json output 2016-08-02 20:55:37 -07:00
638e275114 WIP 2016-08-01 22:10:55 -07:00
300f838a95 WIP 2016-08-01 20:25:10 -07:00
603486f8a4 WIP 2016-08-01 08:53:14 -07:00
724cb5ed2b WIP 2016-07-31 20:37:04 -07:00