Differences

offline_speech_translation [2020/02/03 12:36]
mturchi
offline_speech_translation [2020/05/21 05:29] (current)
mfederico
Line 27: Line 27:
   - The audio files are NOT segmented.
  
-To measure progress in the ST field, each participant is also required to translate the [[http://i13pc106.ira.uka.de/~jniehues/IWSLT-SLT/data/eval/en-de/IWSLT-SLT.tst2019.en-de.tgz|2019 test set]], which is still blind. As with this year's test set, the 2019 test set will be made available with and without automatic segmentation.
+To measure progress in the ST field, each participant is also required to translate the 2019 test set, which is still blind. As with this year's test set, the 2019 test set will be made available with and without automatic segmentation.
  
 +== __**Test sets**__ ==:
 +
 +2019:
 +  - [[http://i13pc106.ira.uka.de/~jniehues/IWSLT-SLT/data/eval/en-de/unsegmented/IWSLT-SLT.unsegmented.tst2019.en-de.tgz|Unsegmented]]
 +  - [[http://i13pc106.ira.uka.de/~jniehues/IWSLT-SLT/data/eval/en-de/segmented/IWSLT-SLT.segmented.tst2019.en-de.tgz|Segmented]]
 +
 +2020:
 +  - [[http://i13pc106.ira.uka.de/~jniehues/IWSLT-SLT/data/eval/en-de/unsegmented/IWSLT-SLT.unsegmented.tst2020.en-de.tgz|Unsegmented]]
 +  - [[http://i13pc106.ira.uka.de/~jniehues/IWSLT-SLT/data/eval/en-de/segmented/IWSLT-SLT.segmented.tst2020.en-de.tgz|Segmented]]
  
 ===Past Editions Development Data===
Line 77: Line 86:
   * The TAR archive should include in the file name the type of system (cascade/end-to-end) used to generate the submission
   * Each run has to be stored in a plain text file with one sentence per line
-  * Scoring will be case-sensitive and will include punctuation. Submissions have to be in UTF-8.
+  * Scoring will be case-sensitive and will include punctuation. Submissions have to be in UTF-8. Tags such as applause, laughing, etc. are not considered during the evaluation.
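
The run-file rules in this hunk (plain text, one sentence per line, UTF-8, system type in the archive file name) can be checked locally before submitting. The following is a minimal sketch and not the organizers' tooling: the function names, the example archive name `myteam.cascade.tar`, and the rule that an empty line counts as an error are assumptions made for illustration. The full file-naming scheme is not shown in this diff, so only the system-type requirement is checked.

```python
import tarfile
from pathlib import Path

SYSTEM_TYPES = ("cascade", "end-to-end")  # the two system types named on the page


def check_run_file(path):
    """Return a list of problems found in one run file (empty list means OK)."""
    raw = Path(path).read_bytes()
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as err:
        return [f"{path}: not valid UTF-8 ({err})"]
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Assumption: an empty line means a sentence is missing.
        if not line.strip():
            problems.append(f"{path}: line {number} is empty")
    return problems


def build_archive(archive_path, run_files):
    """Pack run files into an uncompressed TAR whose name states the system type."""
    name = Path(archive_path).name
    if not any(kind in name for kind in SYSTEM_TYPES):
        raise ValueError(f"archive name must contain one of {SYSTEM_TYPES}")
    with tarfile.open(archive_path, "w") as tar:
        for run in run_files:
            # Store each run under its bare file name (hypothetical layout).
            tar.add(run, arcname=Path(run).name)
    return archive_path
```

A typical use would be to call `check_run_file` on every run, fix anything it reports, and only then call `build_archive`.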
  
 TAR archive file structure:
Line 89: Line 98:
 <Task> = <fromLID>-<toLID>
 <fromLID>, <toLID> = Language identifiers (LIDs) as given by ISO 639-1 codes; see for example the WIT3 webpage
 +
 +All submissions should be sent to this address: <iwslt_offline_task_submission@fbk.eu>
 +
 +The email should include the following information:
 +
 +  * Institute:
 +  * Contact Person:
 +  * Email:
 +  * Data condition: Constrained/Unconstrained
 +  * Segmentation: Own/Given
 +  * Brief abstract about the system:
 +  * Do you want to make your submissions freely available for research purposes? (yes/no)
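
The email described in these added lines can be composed programmatically. The sketch below uses Python's standard `email.message.EmailMessage` to build, but not send, a message carrying the requested items and the TAR archive. The function name, the subject line, the `application/x-tar` attachment type, and the idea of attaching the archive to the same email are assumptions; the page states only the address and the list of items.

```python
from email.message import EmailMessage
from pathlib import Path

SUBMISSION_ADDRESS = "iwslt_offline_task_submission@fbk.eu"  # address given on the page


def build_submission_email(sender, archive_path, fields):
    """Compose (but do not send) a submission email with the TAR attached."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = SUBMISSION_ADDRESS
    msg["Subject"] = "IWSLT offline speech translation submission"  # assumed wording
    # One "Label: value" line per requested item, in the order supplied.
    body = "\n".join(f"{label}: {value}" for label, value in fields.items())
    msg.set_content(body)
    archive = Path(archive_path)
    msg.add_attachment(
        archive.read_bytes(),
        maintype="application",
        subtype="x-tar",  # assumed MIME subtype for a plain TAR file
        filename=archive.name,
    )
    return msg
```

The `fields` argument would be a dictionary keyed by the items listed above (Institute, Contact Person, Email, Data condition, Segmentation, abstract, and the availability answer); the returned message can then be handed to whatever mail client or SMTP setup the participant already uses.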
 +
 +
  
  
Line 100: Line 123:
 Jan Niehues (Maastricht University, Netherlands) \\
 Matteo Negri (FBK, Italy)\\
 +Roldano Cattoni (FBK, Italy)