Programming Geeks?
Oct. 18th, 2007 02:20 am
title author journal volume number month year pages abstract
^(.+)<NewLine1>(.+)<NewLine\d>(.+)[,>]\s*Vol\.\s*(\d+),\s*No\.\s*(\d+).+\((.+),\s*(\d\d\d\d)\),\s*p{1,2}\.([\d|\-|\s]+)\.{0,1}.+Abstract(.+)$
A lot of the above looks like the syntax used in grep to me.
Is this a particular language, or should I just go with the grep commands?
In case you're wondering... the above is a regular expression from the regexps file for cb2Bib, a program that attempts to parse citations or articles in order to sort out the different data categories, and then spit out BibTeX or store the data in a database as individual fields that can be sorted usefully, etc. It's invaluable in citation analysis... if I can write a regular expression for the citation style used by the journal we're looking at, Harvard Style.
I haven't done any programming or shell type stuff in a long time. I think I can figure this out, but it ain't gonna be easy for me, given how long it's been.
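To see how the nine field names line up with the nine capture groups, here is a rough sketch in Python. It assumes Python's re module treats this pattern closely enough to whatever engine cb2Bib uses, and that the <NewLineN> markers are literal tags standing in for line breaks, which is how the pattern reads. The sample citation and the parse_citation helper are made up for illustration; they are not part of cb2Bib.

```python
import re

# Field names from the line above the pattern, in capture-group order.
FIELDS = "title author journal volume number month year pages abstract".split()

# The pattern from the regexps file, copied unchanged (split over two lines).
PATTERN = re.compile(
    r"^(.+)<NewLine1>(.+)<NewLine\d>(.+)[,>]\s*Vol\.\s*(\d+),\s*No\.\s*(\d+)"
    r".+\((.+),\s*(\d\d\d\d)\),\s*p{1,2}\.([\d|\-|\s]+)\.{0,1}.+Abstract(.+)$"
)


def parse_citation(text):
    """Return a dict of field name -> captured text, or None if no match."""
    match = PATTERN.match(text)
    if match is None:
        return None
    # Strip the stray spaces that some groups pick up (pages, abstract).
    return {name: value.strip() for name, value in zip(FIELDS, match.groups())}


# Hypothetical input: a citation with its line breaks already replaced
# by <NewLineN> tags. All names and numbers here are invented.
SAMPLE = (
    "A Sample Title<NewLine1>"
    "J. Doe and A. Smith<NewLine2>"
    "Journal of Examples, Vol. 12, No. 3 (March, 2005), pp. 101-110.<NewLine3>"
    "Abstract This is a made-up abstract."
)

fields = parse_citation(SAMPLE)
for name in FIELDS:
    print(f"{name}: {fields[name]}")
```

Each pair of parentheses in the pattern captures one field, in the same order as the list of names, so adapting it to another citation style is mostly a matter of changing the literal text between the groups (Vol., No., pp., Abstract).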
no subject
Date: 2007-10-18 01:01 pm (UTC)