vcvpaiva/NLIMath

Math Concept Identification and NLI Math

A repo for experiments in "math concept identification" using the TAC corpus and the nLab corpus, along with first thoughts about NLI (natural language inference) for mathematics.

The TAC corpus can be found at https://github.com/ToposInstitute/tac-corpus.

A selection of 436 sentences from the TAC corpus (some are empty), chosen for their size (not too big, not too small) and lack of LaTeX, is at https://github.com/ToposInstitute/tac-corpus/blob/main/golden-attempt/examples.txt. It is repeated here for convenience, both as Experiment2.txt in the folder Experiment436 and as the file 436sentences.txt.
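A minimal sketch of this kind of selection, assuming one sentence per input string. The word-count thresholds and the LaTeX heuristic below are hypothetical, since the selection criteria are only described informally above; this is not the script that produced the 436 sentences.

```python
import re

# Hypothetical thresholds: the README only says "not too big, not too small".
MIN_WORDS = 3
MAX_WORDS = 40

# Rough heuristic for LaTeX: a dollar sign, or a backslash followed by a letter.
LATEX_PATTERN = re.compile(r"\$|\\[A-Za-z]")


def is_candidate(sentence: str) -> bool:
    """Return True if the sentence has a moderate length and no obvious LaTeX."""
    words = sentence.split()
    if not (MIN_WORDS <= len(words) <= MAX_WORDS):
        return False
    return LATEX_PATTERN.search(sentence) is None


def select_sentences(sentences):
    """Keep only the sentences that pass the size and LaTeX checks."""
    return [s.strip() for s in sentences if is_candidate(s)]
```

For example, `select_sentences(open("436sentences.txt", encoding="utf-8"))` would re-apply the filter to the file in this repo, assuming it holds one sentence per line.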

The nLab corpus (from around 2020) is at https://github.com/ToposInstitute/nlab-corpus.

Short guidelines for annotation by mathematicians, already agreed on:

  1. Try to treat math concepts as black boxes, as much as possible.

  2. Use the singular, not the plural, for concepts. Avoid capital letters in concepts, as much as possible.

  3. If a long span is a concept, e.g. “enriched accessible categories”, also list its sensible subspans, like “accessible category”.

Some of the sentences contain no mathematical concepts at all, e.g. "Further applications are given."
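Guidelines 2 and 3 can be sketched in code as a normalization step for annotated spans. The function names are hypothetical, the singularization is a crude suffix heuristic (irregular plurals would need a lookup table), and lowercasing everything ignores cases where a capital should stay, such as a proper name inside a concept. The subspans it lists are only candidates; deciding which ones are sensible remains the annotator's call.

```python
def singularize(word: str) -> str:
    """Crude singularization of a head noun; an assumption, not a full stemmer."""
    if word.endswith("ies") and len(word) > 3:
        return word[:-3] + "y"       # categories -> category
    if word.endswith("sses"):
        return word[:-2]             # classes -> class
    if word.endswith(("ss", "us", "is", "os")):
        return word                  # class, topos, basis stay unchanged
    if word.endswith("s"):
        return word[:-1]             # functors -> functor
    return word


def normalize_concept(span: str) -> str:
    """Lowercase the span and put its last word (the head noun) in the singular."""
    words = span.lower().split()
    if not words:
        return ""
    words[-1] = singularize(words[-1])
    return " ".join(words)


def candidate_subspans(concept: str):
    """List the proper suffixes of a concept as subspan candidates for review."""
    words = concept.split()
    return [" ".join(words[i:]) for i in range(1, len(words))]


concept = normalize_concept("Enriched Accessible Categories")
print(concept)                      # enriched accessible category
print(candidate_subspans(concept))  # ['accessible category', 'category']
```

A sentence with no mathematical concepts, such as "Further applications are given.", would simply get an empty list of concepts.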
