I am trying to parse a CSV file as below:
String NEW_LINE_SEPARATOR = "\r\n";
CSVFormat csvFileFormat = CSVFormat.DEFAULT.withRecordSeparator(NEW_LINE_SEPARATOR);
CSVParser csvParser = csvFileFormat.parse(new FileReader("201404051539.CSV"));
List<CSVRecord> recordsList = csvParser.getRecords();
Now, the file has normal lines ending with CRLF characters; however, for a few lines there is an additional LF character appearing in the middle, i.e.:
a,b,c,dCRLF --line1
e,fLF,g,h,iCRLF --line2
Due to this, the parse operation creates three records, whereas there are actually only two.
Is there a way I can get the LF character appearing in the middle of the second line not treated as a line break, so that I get only two records upon parsing?
I think univocity-parsers is the only parser you will find that will work with line endings as you expect.
The equivalent code using univocity-parsers will be:
CsvParserSettings settings = new CsvParserSettings();
settings.getFormat().setLineSeparator("\r\n");
settings.getFormat().setNormalizedNewline('\u0001'); // uses a special character to represent a new record instead of \n
settings.setNormalizeLineEndingsWithinQuotes(false); // does not replace \r\n by the normalized newline when reading quoted values
settings.setHeaderExtractionEnabled(true); // extract headers from file
settings.trimValues(false); // does not remove whitespaces around values
CsvParser parser = new CsvParser(settings);
List<Record> recordsList = parser.parseAllRecords(new File("201404051539.CSV"));
If you define a line separator to be \r\n, then this is the only sequence of characters that should identify a new record (when outside quotes). A value can contain a lone \r or \n without being enclosed in quotes, because that's not the line separator sequence.
When parsing the input sample you gave:
String input = "a,b,c,d\r\ne,f\n,g,h,i\r\n";
parser.parseAll(new StringReader(input));
The result will be:
LINE1 = [a, b, c, d]
LINE2 = [e, f
, g, h, i]
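To make the rule above concrete, here is a minimal plain-Java sketch of "only \r\n ends a record". It is not how univocity-parsers is implemented, just an illustration; the class name CrlfOnlySplitter and the method parse are made up for this example, and it assumes the input contains no quoted fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CrlfOnlySplitter {

    // Splits the input into records on "\r\n" only; a lone '\n' or '\r'
    // stays inside the value it appears in.
    // Assumption: no quoted fields, so splitting each record on ',' is enough.
    static List<List<String>> parse(String input) {
        List<List<String>> records = new ArrayList<>();
        int start = 0;
        while (start < input.length()) {
            int end = input.indexOf("\r\n", start);
            if (end < 0) {
                end = input.length(); // last record has no trailing CRLF
            }
            String line = input.substring(start, end);
            // A limit of -1 keeps trailing empty fields.
            records.add(Arrays.asList(line.split(",", -1)));
            start = end + 2; // skip past the CRLF
        }
        return records;
    }

    public static void main(String[] args) {
        String input = "a,b,c,d\r\ne,f\n,g,h,i\r\n";
        List<List<String>> records = parse(input);
        System.out.println(records.size()); // prints 2
    }
}
```

With the sample input, the second record keeps the LF inside its second value ("f\n") instead of being split in two. For real files with quoted values, a proper parser is still the better choice.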
Disclosure: I'm the author of this library. It's open-source and free (Apache 2.0 license).