I have a large string that I need to convert into a DataFrame. For example, the string is:
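meals_string = "APPETIZERS Southern Fried Quail with Greens,Huckleberries,Pecans & Blue Cheese 14.00 Park Avenue Cafe Chopped Salad Goat Feta Cheese,Nigoise Olives,Marinated White 13.00 ENTREES Horseradish Crusted Canadian Salmon,Potato Fritters, Marinated Cucumbers,Chive Vinaigrette 27.00 Sautéed Prawns with Mushroom Tortellini,Grilled Tomato Vinaigrette & Sweet Corn 29.50"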
meals = meals_string.splitlines() gives me a list, but I need to convert the string into a DataFrame with 3 columns: Category; Meal_name; Price.
Posted on 2018-01-21 23:09:25
You can build a relatively simple parser for your string and pass its output directly to pandas.DataFrame, like this:
Code:
def meal_string_parser(meal_string):
    category = ''
    meal = []
    price = 0
    for word in meal_string.split():
        if word:
            try:
                price = float(word)
                yield category, ' '.join(meal), price
                meal = []
            except ValueError:
                # this is not a number, so not a price
                if word.upper() == word and word.isalnum():
                    # found category
                    category = word
                else:
                    meal.append(word)
    if meal:
        yield category, ' '.join(meal), price

Test code:
meals_string = """
APPETIZERS 
    Southern Fried Quail with Greens,Huckleberries,Pecans & Blue Cheese 14.00
    Park Avenue Cafe Chopped Salad Goat Feta Cheese,Nigoise Olives,Marinated White 13.00 
ENTREES
    Horseradish Crusted Canadian Salmon,Potato Fritters, Marinated Cucumbers,Chive Vinaigrette 27.00
    Sautéed Prawns with Mushroom Tortellini,Grilled Tomato Vinaigrette & Sweet Corn 29.50
"""
import pandas as pd
df = pd.DataFrame(meal_string_parser(meals_string),
                  columns='Category Meal_name Price'.split())
print(df)

Results:
     Category                                          Meal_name  Price
0  APPETIZERS  Southern Fried Quail with Greens,Huckleberries...   14.0
1  APPETIZERS  Park Avenue Cafe Chopped Salad Goat Feta Chees...   13.0
2     ENTREES  Horseradish Crusted Canadian Salmon,Potato Fri...   27.0
3     ENTREES  Sautéed Prawns with Mushroom Tortellini,Grille...   29.5

https://stackoverflow.com/questions/48367861
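
To inspect what the parser yields without pandas, here is a minimal sketch. It restates the parser logic from the answer (using try/except/else, which should behave the same way) and runs it on a shortened, made-up single-line menu. The name flat_menu and its contents are hypothetical and are not taken from the question. Because the parser splits on any whitespace, it should not matter whether the string contains line breaks.

```python
def meal_string_parser(meal_string):
    """Yield (category, meal_name, price) tuples from a whitespace-separated menu."""
    category = ''
    meal = []
    price = 0
    for word in meal_string.split():
        try:
            price = float(word)
        except ValueError:
            # Not a number: either an all-caps category or part of a meal name.
            if word.upper() == word and word.isalnum():
                category = word
            else:
                meal.append(word)
        else:
            # A number is taken as the price and closes the current meal.
            yield category, ' '.join(meal), price
            meal = []
    if meal:
        # Trailing words with no price of their own reuse the last price seen.
        yield category, ' '.join(meal), price


# Hypothetical one-line menu, shortened for illustration only.
flat_menu = ("APPETIZERS Fried Quail with Greens 14.00 Chopped Salad 13.00 "
             "ENTREES Crusted Salmon 27.00")

rows = list(meal_string_parser(flat_menu))
for row in rows:
    print(row)
```

Two limits of this approach are worth keeping in mind: any all-caps alphanumeric word inside a meal name (for example "BLT") would be picked up as a category, and any standalone number inside a meal name would be read as a price.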