• Title/Summary/Keyword: PDF Parser

Ice Hockey Research Data Platform from Official Records Data and Verification

  • Jin, Seung-kyo;Jang, Ji-hyun;Kim, Hye-young;Kim, Sun-tae
    • International Journal of Knowledge Content Development & Technology / v.9 no.4 / pp.31-45 / 2019
  • In this study, a database was built by analyzing the record data produced in ice hockey. The deployed data were verified by demonstrating an ice hockey reference service to ice hockey officials and players. The research used data stored in the KNSU Datanest data repository and developed PDF parsers for batch processing of the records. Among the record types, the game summary, team roster, team statistics, and player statistics files were collected, and tables were extracted from them. The PDF records were converted to text in CSV format, which was then converted to DataFrames and loaded into the database. Out of the total 22 types of records, 4 types were constructed with OO data parsed as element values. Data verification found no problems with the quality of the deployed data and showed high satisfaction with the 66 factors provided, compared with the 30 factors provided by the previously used service.
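
The last stage of the pipeline described in the abstract (CSV text loaded into a database) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes the PDF table has already been extracted to CSV text, and it swaps the DataFrame step for Python's standard `csv` module so the example has no third-party dependencies. The table name `player_stats`, the column names, the sample rows, and the use of SQLite are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical CSV text standing in for one parsed player-statistics table.
# The column names and values are made up for illustration only.
SAMPLE_CSV = """player,team,goals,assists
Player One,Team A,2,1
Player Two,Team B,0,3
"""


def load_csv_into_db(csv_text, conn, table="player_stats"):
    """Parse CSV text and insert its rows into an SQLite table.

    Returns the number of rows inserted.
    """
    # Each row becomes a dict keyed by the CSV header names.
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} "
        "(player TEXT, team TEXT, goals INTEGER, assists INTEGER)"
    )
    # Named placeholders are filled from the keys of each row dict.
    conn.executemany(
        f"INSERT INTO {table} VALUES (:player, :team, :goals, :assists)",
        rows,
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # in-memory database for the sketch
n = load_csv_into_db(SAMPLE_CSV, conn)
total_goals = conn.execute("SELECT SUM(goals) FROM player_stats").fetchone()[0]
```

In a batch setting, the same function would be called once per extracted table, with one target table per record type (game summary, team roster, team statistics, player statistics).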