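REXX - Parsing from a CSV file
Question:
I'm having trouble parsing CSV from a text file and was wondering whether you could help. So far I have the following.
The CSV file (DATA.txt) looks like this. It will always have 15 fields, all separated by commas. Not all fields are mandatory, so some fields will be populated and some will be blank.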
Seattle,Lastname,Firstname,DOB,SEX,etc,etc
Seattle,Lastname,Firstname,DOB,,etc,etc
Portland,Lastname,Firstname,DOB,SEX,,,etc
Portland,Lastname,Firstname,DOB,SEX,etc,etc
Here is my REXX code:
SOURCEFILE = "C:\DATA\DATA.TXT"
IF A=2 THEN DO COUNTER=1 TO LINES(SOURCEFILE)
PARSE VALUE LINEIN(SOURCEFILE) WITH CITY "," LAST_NAME "," FIRST_NAME "," MOM_NAME "," MIDDLE_NAME "," DAD_NAME "," DOB "," etc "," etc "," etc "," etc "," SEX "," etc "," etc
CALL SETCURSOR 4,23
CALL CREATEDATA
END
CREATEDATA:
CALL TYPE CITY
CALL PRESS TAB
CALL TYPE LAST_NAME
CALL PRESS TAB
CALL TYPE DATE(U)
CALL PRESS TAB
CALL TYPE FIRST_NAME
CALL PRESS TAB
CALL PRESS ENTER
RETURN
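I don't know whether I should be using ARG or VAR in the parse, or whether I wrote the first two lines correctly. I do know that my CREATEDATA function works, because I get "CITY" typed in instead of the parsed value. Any help would be greatly appreciated. Thanks!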
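Answer
A few comments:
1) On Windows systems, Lines(SourceFile) may have to read the entire file, counting CR-LF sequences. Your Parse value LineIn(SourceFile) loop then reads it again. The typical Rexx way of doing this would be: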
Address SYSTEM 'TYPE' SourceFile with output stem Lines.
Do Counter = 1 to Lines.0
Parse var Lines.Counter ...
End
Drop Lines.
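This works at least as long as the file is not too large to keep in an array in storage.
2) You fall through into CreateData at the end of the loop, which is why you see "CITY". You need a Return or Exit instruction after the End.
3) Given #2, the Parse evidently was never executed, because City is uninitialized (the value of an uninitialized variable in Rexx is its name in uppercase). The loop is conditional on A=2, which apparently is not the case.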
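Points 2 and 3 can be seen in a few lines. The following is only a minimal sketch: the sample record is copied from the question, while the shortened template and the variable name rest are assumptions made for illustration.

```rexx
/* Sketch: an unassigned variable evaluates to its own name in uppercase */
say city                          /* displays CITY, nothing assigned yet  */

record = "Seattle,Lastname,Firstname,DOB,,etc,etc"
parse var record city "," last_name "," first_name "," dob "," sex "," rest

say city                          /* displays Seattle                     */
say "[" || sex || "]"             /* displays [] since the field is empty */
exit                              /* stop here; without this, execution   */
                                  /* would run on into any label below    */
```

An empty field between two commas parses to the null string, so optional fields do not need special handling in the template.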
Answer
One problem is the IF A = 2 test in
IF A=2 THEN DO COUNTER=1 TO LINES(SOURCEFILE)
If A is not 2, the loop is bypassed. I suspect your program should be:
SOURCEFILE = "C:\DATA\DATA.TXT"
DO COUNTER=1 TO LINES(SOURCEFILE)
PARSE VALUE LINEIN(SOURCEFILE) WITH CITY "," LAST_NAME "," FIRST_NAME "," MOM_NAME "," MIDDLE_NAME "," DAD_NAME "," DOB "," etc "," etc "," etc "," etc "," SEX "," etc "," etc
CALL SETCURSOR 4,23
CALL CREATEDATA
END
RETURN /* prevent the fall through to createdata */
CREATEDATA:
---------------------------
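The Parse statement has the following basic format
PARSE [source] [parse control]
where [source] includes
ARG - arguments passed on the procedure call
PULL - data taken from the stack
VAR - data comes from a variable
VALUE - data supplied inline as an expression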
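To make the four sources concrete, here is a small sketch that applies the same template to each of them. The record contents, the variable names and the SHOW routine are hypothetical and chosen only for illustration.

```rexx
/* Sketch: one template, four PARSE sources */
record = "Portland,Lastname,Firstname"

parse var record city "," last "," first          /* VAR: from a variable    */
say "VAR:  " city last first

parse value "Seattle,Doe,Jane" with city "," last "," first  /* VALUE: inline */
say "VALUE:" city last first

queue record                                      /* put a line on the stack */
parse pull city "," last "," first                /* PULL: take it back off  */
say "PULL: " city last first

call show record                                  /* pass it as an argument  */
exit                                              /* do not fall into SHOW   */

show:
  parse arg city "," last "," first               /* ARG: the call argument  */
  say "ARG:  " city last first
  return
```

PARSE VALUE needs the WITH keyword to mark where the expression ends and the template begins; the other three forms go straight to the template.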
So your parse could also be written as
linein = LINEIN(SOURCEFILE)
PARSE var linein CITY "," LAST_NAME "," FIRST_NAME "," MOM_NAME "," MIDDLE_NAME "," DAD_NAME "," DOB "," etc "," etc "," etc "," etc "," SEX "," etc "," etc
or
DO COUNTER=1 TO LINES(SOURCEFILE)
CALL SETCURSOR 4,23
CALL CREATEDATA LINEIN(SOURCEFILE)
END
RETURN /* prevent the fall through to createdata */
CREATEDATA:
parse arg CITY "," LAST_NAME "," FIRST_NAME "," MOM_NAME "," MIDDLE_NAME "," DAD_NAME "," DOB "," etc "," etc "," etc "," etc "," SEX "," etc "," etc
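Finally, as Ross says, you should try to avoid Lines(SourceFile), because it involves reading the whole file.
According to Cowlishaw's Rexx book, the built-in function LINES may return the number of lines in the referenced file or, if that cannot be determined, '1' where a non-zero count would be appropriate, and '0' otherwise. I use ooRexx on Windows quite a bit and can confirm that ooRexx does not count all the lines; it only returns 0/1. I use the following to read a file one line at a time: DO WHILE LINES(filename) > 0; PARSE VALUE LINEIN(filename) WITH ...; END – NealB 2013-04-12 13:38:18
The result of "Lines()" is implementation dependent. Some implementations return a count, others only return 1 or 0. That is also why I prefer loading a stem: 'Lines.0' is an actual count. – 2013-04-12 23:33:01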
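The loop from the first comment can be written out so that it behaves the same whether LINES() returns a real count or only 1/0. This is a sketch rather than a tested program: the path is the one from the question, and the shortened template with the variable rest is an assumption for illustration.

```rexx
/* Sketch: read line by line, treating LINES() only as "is there more?" */
sourcefile = "C:\DATA\DATA.TXT"        /* path taken from the question   */
count = 0
do while lines(sourcefile) > 0         /* non-zero: at least one line    */
  parse value linein(sourcefile) with city "," last_name "," first_name "," rest
  count = count + 1                    /* keep our own line counter      */
  say count":" city last_name first_name
end
```

Counting inside the loop gives an actual line count without depending on how a particular interpreter implements LINES().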