Documentation

Web Application

1. Parse

To get started, paste your list of references into the textarea above. AnyStyle processes one reference per line, so please make sure each reference starts on a new line and remove any superfluous line breaks within a reference. Empty lines are fine; the parser will just skip them.
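If your references come from a document with ragged whitespace, a small script can tidy them before you paste. The following is a minimal sketch in plain Ruby; `prepare_references` is a hypothetical helper, not part of AnyStyle, and it assumes every reference already sits on its own line.

```ruby
# Hypothetical helper (not part of AnyStyle): tidy pasted text so that
# each remaining line holds exactly one reference.
def prepare_references(text)
  text
    .lines                                  # split on line breaks
    .map { |line| line.strip.squeeze(" ") } # trim and collapse repeated spaces
    .reject(&:empty?)                       # drop blank lines (the parser skips them anyway)
end

pasted = <<~TEXT
  Turing, Alan, Computing Machinery and Intelligence, Mind 59, pp 433-460 (1950)

     Shannon, Claude, A Mathematical Theory of Communication, Bell System Technical Journal 27, pp 379-423 (1948)
TEXT

references = prepare_references(pasted)
puts references
```

This does not rejoin references that were broken across several lines; those still need to be merged by hand.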

When you're ready, hit the parse button!

2. Edit

AnyStyle splits your references into segments (author, title, etc.) based on machine learning heuristics. These segments are displayed in the token editor above. Please review each segment to make sure the results are correct. If the parser got something wrong, select the affected tokens as you would select text in your favourite text editor, then click the Assign label button to assign the correct label to your selection.

Pro tip: use Shift and Ctrl/Command to make multiple selections or double-click to select an entire segment at once.

When using the token editor, note that the parser requires every token to be assigned a label and each segment to be contiguous. For best results, combine tokens that semantically belong together. The word "in", for example, typically belongs to either the editor or the container-title segment (in fact, it is a good indicator for those fields). When in doubt about how to label a reference, please get in touch with us or take a look at our core training data for examples of correctly labelled references.
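The two labelling rules (every token has a label, and each segment is contiguous) can be expressed as a short check. This is only an illustrative sketch: the array of `[token, label]` pairs and the `labelling_problems` helper are assumptions made for this example and do not reflect AnyStyle's internal data format.

```ruby
# Hypothetical check (not part of AnyStyle). A reference is modelled as an
# array of [token, label] pairs; a nil label means "not labelled yet".
def labelling_problems(tokens)
  problems = []
  closed = []      # labels whose segment has already ended
  previous = nil   # label of the preceding token

  tokens.each do |token, label|
    if label.nil?
      problems << "unlabelled token: #{token}"
    elsif label != previous && closed.include?(label)
      problems << "segment '#{label}' is not contiguous at: #{token}"
    end
    closed << previous if previous && previous != label
    previous = label
  end

  problems
end

good = [["Turing,", "author"], ["Alan,", "author"],
        ["Computing", "title"], ["Machinery", "title"],
        ["Mind", "container-title"], ["59,", "volume"]]

bad = [["Turing,", "author"], ["Computing", "title"],
       ["Alan,", "author"], ["Mind", nil]]

puts labelling_problems(good).inspect
puts labelling_problems(bad)
```

In the second list the author segment is interrupted by a title token and the last token has no label, so both rules are reported as violated.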

3. Save

That's it! When you've reviewed the segments just click on one of the available output formats to convert and save your references.

But there is more: because AnyStyle is based on machine learning, you can help us improve! If the parser produces poor results for your citation style or language, use the token editor to correctly label a handful of references. When you save the results, we will extract your adjustments and use them to train the parser. Give us a few minutes to crunch the numbers, then parse your references again using the updated model; we hope the results will be much improved!

Pro tip: the current model's timestamp is included at the bottom of the parsed results; when training the model, keep an eye on the time to see whether or not your training data was already merged into the model.

Please note that we receive a lot of training data. Too much data, or inconsistently labelled data, can cause the model to deteriorate, so we may reset the model from time to time. Please let us know if training the model does not work for you or if your parse results are poor so we can take a look. If you are interested in helping us curate the training data, your help would be much appreciated!

RubyGem

AnyStyle is open source software and freely available as a RubyGem!

$ [sudo] gem install anystyle

After installing the gem, you can start parsing references on your own computer directly from Ruby like this:

>> require "anystyle"

>> AnyStyle.parse "Turing, Alan, Computing Machinery and Intelligence, Mind 59, pp 433-460 (1950)"
=> [{
     :type => "article-journal",
     :author => [{ :family => "Turing", :given => "Alan" }],
     :date => ["1950"],
     :title => ["Computing Machinery and Intelligence"],
     :"container-title" => ["Mind"],
     :volume => ["59"],
     :pages => ["433–460"],
     :language => "en",
     :scripts => ["Common", "Latin"]
   }]
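Each parsed reference is an ordinary Ruby hash, so the usual hash and array methods apply. The sketch below copies a subset of the result shown above as a literal so that it runs without the gem installed; `short_label` is a hypothetical helper written for this example, not part of the AnyStyle API.

```ruby
# A subset of the parse result shown above, copied as a literal.
reference = {
  :type => "article-journal",
  :author => [{ :family => "Turing", :given => "Alan" }],
  :date => ["1950"],
  :title => ["Computing Machinery and Intelligence"],
  :"container-title" => ["Mind"],
  :volume => ["59"]
}

# Hypothetical helper: build a short author-year label from one reference.
# Fields may be missing, so fall back to empty arrays.
def short_label(ref)
  family = ref.fetch(:author, []).map { |a| a[:family] }.compact.join(" & ")
  year = ref.fetch(:date, []).first
  [family, year].compact.reject(&:empty?).join(" ")
end

puts short_label(reference)               # prints "Turing 1950"
puts reference[:"container-title"].first  # prints "Mind"
```

Note that most fields are arrays, even when they hold a single value, and that hyphenated keys such as `:"container-title"` need the quoted symbol syntax.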

The RubyGem also includes a finder module to extract references from full-text PDF documents.
For more details on how to use and train the parser and finder, please consult the API documentation.

On the Command Line

AnyStyle also has a command-line interface.

$ [sudo] gem install anystyle-cli

After installing the gem, run the help command to see the available commands and options:

$ anystyle --help

NAME
    anystyle - Finds and parses bibliographic references

SYNOPSIS
    anystyle [global options] command [command options] [arguments...]

VERSION
    1.3.3 (cli 1.2.0, data 1.2.0)

GLOBAL OPTIONS
    -F, --finder-model=file - Set the finder model file (default: none)
    -P, --parser-model=file - Set the parser model file (default: none)
    --adapter=name          - Set the dictionary adapter (default: ruby)
    -f, --format=name       - Set the output format (default: ["json"])
    --pdfinfo=path          - Set the path for pdfinfo (default: none)
    --pdftotext=path        - Set the path for pdftotext (default: none)
    --help                  - Show this message
    --[no-]stdout           - Print results directly to stdout
    --[no-]verbose          - Print status messages to stderr
    --version               - Display the program version
    -w, --[no-]overwrite    - Allow overwriting existing files

COMMANDS
    check   - Check tagged documents or references
    find    - Find and extract references from text documents
    help    - Shows a list of commands or help for one command
    license - Print license information
    parse   - Parse and convert references
    train   - Create a new finder or parser model
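The SYNOPSIS above places global options before the command and its arguments. As a rough sketch, the following Ruby snippet assembles such a command line without running it. The `anystyle_command` helper and the file name `refs.txt` are hypothetical, and passing an input file as the argument to `parse` is an assumption, since the help output above does not list command arguments; `anystyle help parse` should show the actual usage.

```ruby
require "shellwords"

# Hypothetical helper: assemble an anystyle command line following the
# SYNOPSIS (global options first, then the command, then its arguments).
# Only options listed in the help output above are used.
def anystyle_command(command, *arguments, format: "json", stdout: true)
  argv = ["anystyle", "-f", format]
  argv << (stdout ? "--stdout" : "--no-stdout")
  argv << command
  argv.concat(arguments)   # assumption: the command accepts file arguments
  Shellwords.join(argv)    # escape each part for safe use in a shell
end

puts anystyle_command("parse", "refs.txt")
# prints: anystyle -f json --stdout parse refs.txt
```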