
Filter file contents into a table

Stack Overflow user
Asked 2015-06-26 10:34:08
Answers: 2 · Views: 66 · Followers: 0 · Votes: 1

This is the input I have generated. It shows the course versions of Jany and Marco at different times.

on 10:00 the course of jany 1 is :
course:theory:nothing
course:applicaton:onehour

on 10:00 the course of jany 2 is :
course:theory:math
course:applicaton:twohour

on 10:00 the course of Marco 1 is :
course:theory:geo
course:applicaton:halfhour

on 10:00 the course of Marco 2 is :
course:theory:history
course:applicaton:nothing

on 14:00 the course of jany 1 is :
course:theory:nothing
course:applicaton:twohours

on 14:00 the course of jany 2 is :
course:theory:music
course:applicaton:twohours

on 14:00 the course of Marco 1 is :
course:theory:programmation
course:applicaton:onehours

on 14:00 the course of Marco 2 is :
course:theory:philosophy
course:applicaton:nothing

Using awk, I managed to sort it:

awk -F '[ :]' '/the course of/{h=$2;m=$3} /theory/{print h":"m" theory:"$3}' f.txt
awk -F '[ :]' '/the course of/{h=$2;m=$3} /applicaton/{print h":"m" application:"$3}' f.txt

Output:

10:00 theory:nothing
14:00 theory:nothing

10:00 application:onehour
14:00 application:twohours

Now I would like to improve the filter by adding the name (jany, Marco) and the version (1 or 2), as shown below.

Jany 1,10:00,14:00
theory,nothing,nothing
application,onehour,twohour

Jany 2,10:00,14:00
theory,math,music
application,twohour,twohour

Marco 1,10:00,14:00
theory,geo,programmation
application,halfhour,onehour

Marco 2,10:00,14:00
theory,history,philosophy
application,nothing,nothing

I am stuck on how to extract the name and number and collect the information about their courses into a sorted, filtered table.

2 Answers

Stack Overflow user

Accepted answer

Posted on 2015-06-26 11:13:09

Try this:

BEGIN {
    # set records separated by empty lines
    RS=""
    # set fields separated by newline, each record has 3 fields
    FS="\n"
}
{
    # remove undesired parts of every first line of a record
    sub("the course of ", "", $1)
    sub(" is :", "", $1)
    sub("on ", "", $1)
    # now store the rest in time and course
    time=$1
    course=$1
    # remove time from string to extract the course title
    sub("^[^ ]* ", "", course)
    # keep only the leading time by dropping everything from the first space on
    sub(" .*", "", time)
    # get theory info from second line per record
    sub("course:theory:", "", $2)
    # get application info from third line
    sub("course:applicaton:", "", $3)
    # if new course
    if (! (course in header)) {
        # save header information (first words of each line in output)
        header[course] = course
        theory[course] = "theory"
        app[course] = "application"
    }
    # append the relevant info to the output strings
    header[course] = header[course] "," time
    theory[course] = theory[course] "," $2
    app[course] = app[course] "," $3

}
END {
    # now for each course found
    for (key in header) {
        # print the strings constructed
        print header[key]
        print theory[key]
        print app[key]
        print ""
    }
}

I hope the comments are self-explanatory. If you have questions about the script, feel free to ask.
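One caveat with the script above: POSIX awk does not specify the order in which `for (key in header)` visits keys, so the blocks may come out in any order. A minimal sketch of one way around that, assuming first-seen order is acceptable, is to record each key in a numerically indexed array as it first appears. The function name `pivot_courses` and the `order` array are hypothetical names used only for illustration; the input spelling `applicaton` is taken from the question.

```shell
# Hypothetical helper: reads the schedule on stdin and prints one block per
# "name version" key, in the order the keys first appear in the input.
pivot_courses() {
    awk '
    BEGIN { RS = ""; FS = "\n" }    # one record per blank-line-separated block, one field per line
    {
        # $1 looks like: on 10:00 the course of jany 1 is :
        split($1, w, " ")
        time = w[2]
        key  = w[6] " " w[7]        # name and version

        theory_val = $2; sub(/^course:theory:/, "", theory_val)
        app_val    = $3; sub(/^course:applicaton:/, "", app_val)

        if (!(key in header)) {
            order[++n]  = key       # remember first-seen position
            header[key] = key
            theory[key] = "theory"
            app[key]    = "application"
        }
        header[key] = header[key] "," time
        theory[key] = theory[key] "," theory_val
        app[key]    = app[key] "," app_val
    }
    END {
        # walk the keys by position instead of relying on for-in order
        for (i = 1; i <= n; i++) {
            key = order[i]
            print header[key]
            print theory[key]
            print app[key]
            print ""
        }
    }'
}
```

It can be called as `pivot_courses < f.txt`. Sorting the keys instead would need either an external `sort` or a gawk-specific feature.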

Votes: 1

Stack Overflow user

Posted on 2015-06-26 15:56:49

Using GNU awk for true multi-dimensional arrays and sorted_in:

$ cat tst.awk
BEGIN{ RS=""; FS="[[:space:]:]+" }
{
    for (i=11; i<=NF; i+=3) {
        sched[$7" "$8][$2":"$3][$i] = $(i+1)
        courses[$i]
    }
}
END {
    PROCINFO["sorted_in"] = "@ind_str_asc"
    for (name in sched) {
        printf "%s", name
        for (time in sched[name]) {
            printf ",%s", time
        }
        print ""
        for (course in courses) {
            printf "%s", course
            for (time in sched[name]) {
                printf ",%s", sched[name][time][course]
            }
            print ""
        }
        print ""
    }
}

$ gawk -f tst.awk file
Marco 1,10:00,14:00
applicaton,halfhour,onehours
theory,geo,programmation

Marco 2,10:00,14:00
applicaton,nothing,nothing
theory,history,philosophy

jany 1,10:00,14:00
applicaton,onehour,twohours
theory,nothing,nothing

jany 2,10:00,14:00
applicaton,twohour,twohours
theory,math,music

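It does not produce exactly the expected output you posted, but I think that is because the expected output you posted is wrong (for example, compare the 14:00 application output for jany 1 against the input: the input is twohours, which is what my script produces, but you say the expected output is halfhour).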
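For awk implementations that lack gawk's arrays of arrays and `PROCINFO["sorted_in"]`, roughly the same pivot can be sketched with comma subscripts, which POSIX awk joins with `SUBSEP` into a single key. This is an illustrative variant under stated assumptions, not the answerer's code: the function name `pivot_generic` and the helper arrays (`names`, `times`, `labels`, `seen_*`) are hypothetical, groups are printed in first-seen order rather than sorted, and the time columns come from one global list, so a name with no entry at some time gets an empty cell.

```shell
# Hypothetical helper: POSIX-awk variant of the pivot, reading from stdin.
pivot_generic() {
    awk '
    BEGIN { RS = ""; FS = "[ \t\n:]+" }   # paragraph records; split on whitespace and colons
    {
        name = $7 " " $8                  # e.g. jany 1
        time = $2 ":" $3                  # e.g. 10:00
        if (!(name in seen_name)) { seen_name[name] = 1; names[++nn] = name }
        if (!(time in seen_time)) { seen_time[time] = 1; times[++nt] = time }
        # fields 10.. come in triples: course, <label>, <value>
        for (i = 11; i <= NF; i += 3) {
            if (!($i in seen_label)) { seen_label[$i] = 1; labels[++nl] = $i }
            sched[name, time, $i] = $(i + 1)   # comma subscripts are joined with SUBSEP
        }
    }
    END {
        for (a = 1; a <= nn; a++) {
            line = names[a]
            for (b = 1; b <= nt; b++) line = line "," times[b]
            print line
            for (c = 1; c <= nl; c++) {
                line = labels[c]
                for (b = 1; b <= nt; b++) line = line "," sched[names[a], times[b], labels[c]]
                print line
            }
            print ""
        }
    }'
}
```

It can be called as `pivot_generic < file`. The row labels are taken from the input as written, so they keep the input's `applicaton` spelling.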

Votes: 2
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/31070972
