The files in this package let you obtain the dataset used in Section 2 of the paper:
"A Parser for the Efficient Induction of Biological Grammars" by Chris H. Bryant and Daniel Fredouille
(http://www.comp.rgu.ac.uk/staff/chb/research/data_sets/ilp05/).
Please refer to the paper for details of these experiments.

The package should contain:
- README : This file.
- middle.b, middle_nonredundant.f, middle.n : The inference files (background knowledge, positives and randoms, respectively).
- clause_to_grammar_rule.pl : Some predicates used to transform clauses representing grammar rules into something more human readable.
- infer.pl : The line to execute in Yap to start inference.
- run_set_construction : The complete command line that runs inference and saves the inferred clauses in files.
- split_on_clause_length.py : A script that splits the obtained rules according to their length.

To obtain the dataset used in the paper, take the following steps:
-1) Save this package (set_construction.tar.gz) in a directory on your computer.
-2) Uncompress and unarchive it with the command lines:
      gunzip set_construction.tar.gz
      tar -xf set_construction.tar
-3) Edit the file "infer.pl" to specify where your Aleph files are saved (change line 5).
-4) Run the program with the command:
      run_set_construction
    This creates three files in your directory; the first two form the dataset used in the paper. The run may take a while.
    - res_clauses.pl : The set of clauses encountered during inference (one per line).
    - res_rules.txt : The set of clauses encountered during inference in a more readable format (one per line): a line [middle,aromatic,gap] corresponds to the grammar rule: middle -> aromatic gap.
    - output.rules : The rules inferred (not needed for the dataset).
-5) In addition to the dataset, if you wish to reproduce the experiments of Section 2 of the paper, you will need to download the package set_parsing.tar.gz (http://www.comp.rgu.ac.uk/staff/chb/research/data_sets/ilp05/).

Please contact the authors with any questions. Their contact details are available at:
http://www.comp.rgu.ac.uk/staff/df/ or http://www.comp.rgu.ac.uk/staff/chb/
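
As an illustration of the res_rules.txt format described in step 4, the following is a minimal Python sketch that turns one line such as [middle,aromatic,gap] into the grammar rule middle -> aromatic gap. It is not part of the package: the function name rule_line_to_grammar_rule is hypothetical, and the sketch assumes every symbol is a plain name with no embedded commas or nested brackets.

```python
def rule_line_to_grammar_rule(line):
    """Turn a line like '[middle,aromatic,gap]' into 'middle -> aromatic gap'.

    Assumption (not confirmed by the package): symbols are plain names,
    separated by commas, with no nested brackets.
    """
    text = line.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("not a bracketed rule line: %r" % line)
    # Drop the surrounding brackets and split into symbols.
    symbols = [s.strip() for s in text[1:-1].split(",") if s.strip()]
    if not symbols:
        raise ValueError("empty rule line: %r" % line)
    # The first symbol is the left-hand side; the rest form the right-hand side.
    head, body = symbols[0], symbols[1:]
    return (head + " -> " + " ".join(body)).rstrip()


if __name__ == "__main__":
    print(rule_line_to_grammar_rule("[middle,aromatic,gap]"))
```

Run on the example line from step 4, this prints: middle -> aromatic gap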