Open Biblio corpus for content analysis

Description

Description of the corpus

The corpus contains full-text scientific publications (mathematics, computing, statistics) in LaTeX or TXT format.
All of them are published in open access.

This corpus serves two purposes:

  • information extraction (for instance, extracting all collocations around a target word, or extracting method names)
  • comparison of abstract and body text
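
The first purpose, extracting collocations around a target word, can be sketched as follows. This is a minimal illustration and not a method supplied with the dataset: the function name `collocations`, the `window` parameter, and the sample sentence are all hypothetical, and the tokenizer is a deliberately naive one that keeps only runs of ASCII letters (so LaTeX markup would need cleaning beforehand).

```python
import re
from collections import Counter


def collocations(text, target, window=2):
    """Count the words found within `window` tokens of each occurrence of `target`.

    Hypothetical helper for illustration; tokenization is a naive
    lowercase split on runs of ASCII letters.
    """
    target = target.lower()
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        # Clamp the context window to the bounds of the token list.
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


# Made-up sample text, not taken from the corpus.
sample = "We propose a Markov chain model. The Markov chain converges quickly."
result = collocations(sample, "chain", window=1)
print(result.most_common(3))
```

In practice the same function would be applied to the body text of each publication, and the counts summed over the corpus before ranking.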

Size of the publication corpus: 650,000
Size of the publication sample: 20

Data:

body (string): text data

Author

This dataset has been published on the initiative and under the responsibility of Nicolas Turenne
Published on December 1, 2016 and updated on December 2, 2016

Latest update

October 12, 2023

License

Creative Commons Attribution

Metadata quality
77.77777777777779/100

Spatial coverage not set

Some files are unavailable


Information

Tags

ID

5840026288ee383a2cc65bb3

Temporality

Creation

December 1, 2016

Frequency

Biannual

Temporal coverage

1994/01/01 to 2014/07/01

Latest update

October 12, 2023

Statistics for the year

Views

535

3 in May 2024

Downloads

37

Reuses of this dataset

0

Followers

0