  1. May 01, 2018
  2. Apr 26, 2018
  3. Mar 14, 2018
  4. Nov 22, 2017
  5. Sep 26, 2017
  6. Sep 08, 2017
  7. Sep 05, 2017
    • NLTK support: Fix passing of multiple corpora identifiers (#460) · 4212e063
      Ed Morley authored
      * NLTK support: Update test to use multiple corpora
      
      So that the incorrect handling of multiple IDs seen in #444 would
      have been caught.
      
      Also switches to some of the smaller corpora, to reduce time spent
      downloading during tests (see sizes on http://www.nltk.org/nltk_data/).
      
      * NLTK support: Fix passing of multiple corpora identifiers
      
      As part of fixing the shellcheck warnings in #438, double quotes had
      been placed around `$nltk_packages` passed to the `nltk.downloader`,
      which causes multiple identifiers to be treated as though they were
      just one identifier that contains spaces.
      
      The docs for the shellcheck warning in question recommend using arrays
      if the intended behaviour really is to split on spaces:
      https://github.com/koalaman/shellcheck/wiki/SC2086#exceptions
      
      As such, `readarray` has been used, which is present in bash >=4.
      The `[*]` array form is used in the log message, to prevent shellcheck
      warning SC2145, whereas `[@]` is used when passed to `nltk.downloader`
      to ensure the array elements are unpacked as required.
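      
      The difference between the quoted string, `[*]` and `[@]` forms can
      be sketched as follows. This is an illustration only, not the
      buildpack's actual script: the corpus names, the `nltk_input`
      variable and the `count_args` helper are assumptions made for the
      example, and nothing is downloaded.

      ```shell
      #!/usr/bin/env bash
      # Hypothetical stand-in for the contents of nltk.txt (one identifier per line).
      nltk_input=$'city_database\nstopwords\nwebtext'

      # readarray (bash >= 4) stores each line as one array element;
      # -t strips the trailing newline from every element.
      readarray -t nltk_packages <<< "$nltk_input"

      # Hypothetical helper: prints how many arguments it received.
      count_args() { echo "$#"; }

      # "[*]" joins all elements into a single word, which suits a log message.
      joined_count=$(count_args "${nltk_packages[*]}")

      # "[@]" expands to one word per element, which is what a command
      # taking several identifiers (such as nltk.downloader) needs.
      unpacked_count=$(count_args "${nltk_packages[@]}")

      echo "Downloading NLTK packages: ${nltk_packages[*]}"
      echo "joined=$joined_count unpacked=$unpacked_count"
      ```

      With three identifiers, the `[*]` form reaches the helper as one
      argument and the `[@]` form as three, which mirrors the bug and the
      fix described above.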
      
      Note: Both before and after this fix, using anything but unix line
      endings in `nltk.txt` will also cause breakage.
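      
      A minimal sketch of why non-unix line endings break things, assuming
      a hypothetical file whose lines end in CRLF. The variable names are
      invented for the example, and the stripping step at the end is one
      possible workaround, not something this commit does.

      ```shell
      #!/usr/bin/env bash
      # Hypothetical CRLF content; the here-string below appends the final newline.
      crlf_input=$'stopwords\r\nwebtext\r'

      # readarray -t removes only the newline, so each element keeps a trailing
      # carriage return and no longer matches the real identifier.
      readarray -t raw_packages <<< "$crlf_input"
      raw_length=${#raw_packages[0]}   # length of "stopwords" plus the stray \r

      # Possible workaround (an assumption, not part of the commit):
      # strip one trailing carriage return from every element.
      cr=$'\r'
      cleaned_packages=("${raw_packages[@]%"$cr"}")

      echo "raw length: $raw_length, cleaned: ${cleaned_packages[*]}"
      ```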
  8. Aug 03, 2017
  9. Jun 20, 2017
  10. Jun 05, 2017
  11. Mar 14, 2017
  12. Mar 10, 2017
  13. Mar 08, 2017
  14. Mar 07, 2017
  15. Feb 01, 2017
  16. Jan 25, 2017