danaeris | Programming Geeks?

title author journal volume number month year pages abstract
^(.+)<NewLine1>(.+)<NewLine\d>(.+)[,>]\s*Vol\.\s*(\d+),\s*No\.\s*(\d+).+$(.+),\s*(\d\d\d\d)$,\s*p{1,2}\.([\d|\-|\s]+)\.{0,1}.+Abstract(.+)$

A lot of the above looks like the syntax used in grep to me.

Is this a particular language, or should I just go with the grep commands?

In case you're wondering, the above is a regular expression from the regexps file for cb2Bib, a program that tries to parse citations or articles, sort out the different data categories, and then either spit out BibTeX or store the data in a database as individual fields that can be sorted usefully. It's invaluable in citation analysis, provided I can write a regular expression for the citation style used by the journal we're looking at, which is Harvard style.
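To make the idea concrete, here is a minimal sketch in Python of how I understand the two lines to work together: the first line names the fields, and each parenthesized group in the pattern fills the field at the same position. This is not cb2Bib's own regex engine, so syntax details may differ. The pattern is a simplified, hypothetical one (month and abstract are left out), the sample citation is made up, and I'm assuming the `<NewLine1>`-style markers stand in for the line breaks in the copied text.

```python
import re

# Field names, in the same order as the capture groups in the pattern below.
FIELDS = "title author journal volume number year pages".split()

# Simplified, hypothetical pattern written for Python's `re` module.
# It is NOT the real cb2Bib entry, only an illustration of the idea.
PATTERN = re.compile(
    r"^(.+)<NewLine1>(.+)<NewLine\d>(.+),\s*Vol\.\s*(\d+),\s*No\.\s*(\d+),"
    r"\s*(\d{4}),\s*pp?\.\s*([\d\-\s]+)\.?$"
)


def parse_citation(text):
    """Return a dict mapping field name to captured text, or None if no match."""
    match = PATTERN.match(text)
    if match is None:
        return None
    # Pair each field name with the capture group at the same position.
    return dict(zip(FIELDS, (group.strip() for group in match.groups())))


# Made-up citation, with line breaks already replaced by <NewLineN> markers.
sample = (
    "A Made-Up Title<NewLine1>A. Author and B. Author<NewLine2>"
    "Journal of Examples, Vol. 12, No. 3, 2005, pp. 45-67."
)

print(parse_citation(sample))
```

One thing I noticed while sketching this: inside square brackets a `|` is a literal pipe character rather than "or", so `[\d|\-|\s]` in the original pattern also accepts pipes in the page range. `[\d\-\s]` is probably what was meant.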

I haven't done any programming or shell-type stuff in a long time. I think I can figure this out, but it ain't gonna be easy for me, given how long it's been.