V2EX  ›  Python

How can I efficiently split long usage records into multiple short records, one per hour?

  nemo95 · Aug 12, 2021 · 2039 views

    The original records look like this:

    USERID STARTTIME ENDTIME SERVICETYPE CHANNELCODE PROGRAMNAME
    xxxxxxxxxxxxxxxxx1 2021-05-24 19:52:28 2021-05-24 23:56:27 1 精灵宝可梦

    My boss wants an analysis of what each user watches in each time period, so I want to split every record like this:

    USERID STARTTIME ENDTIME SERVICETYPE CHANNELCODE PROGRAMNAME PERIODTIME
    xxxxxxxxxxxxxxxxx1 2021-05-24 19:52:28 2021-05-24 20:00:00 1 精灵宝可梦 2021-05-24 19:00:00
    xxxxxxxxxxxxxxxxx1 2021-05-24 20:00:00 2021-05-24 21:00:00 1 精灵宝可梦 2021-05-24 20:00:00
    xxxxxxxxxxxxxxxxx1 2021-05-24 21:00:00 2021-05-24 22:00:00 1 精灵宝可梦 2021-05-24 21:00:00
    xxxxxxxxxxxxxxxxx1 2021-05-24 22:00:00 2021-05-24 23:00:00 1 精灵宝可梦 2021-05-24 22:00:00
    xxxxxxxxxxxxxxxxx1 2021-05-24 23:00:00 2021-05-24 23:56:27 1 精灵宝可梦 2021-05-24 23:00:00

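    My current approach is to generate a time series from the start and end times, build the new rows in a for loop, and then concatenate them into a new DataFrame: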

    # split_data
    ...
    for i in range(0, len_date-1):
        df_y['Period'] = date_rng[i]
        df_y['EndTime'] = date_rng[i+1]
        df_y['StartHour'] = date_rng[i]
        df_y['EndHour'] = date_rng[i+1]
        df_x = df_x.append(df_y)
    ...
    

    The main function then needs another for loop that iterates over the whole set of records:

    for i in range(len(df_tmp)):
        df_x = split_data(df_tmp.iloc[i])
        df_t = df_t.append(df_x)
    df_t
    

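    This produces the result I want, but there are so many records that it takes forever to run...

    Is there a better way to make this more efficient?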

    2 replies    2021-08-12 16:19:12 +08:00
    #1 · swulling · Aug 12, 2021
    Split the records into chunks and process them with multiple processes; if that is still not enough, add more machines.
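A minimal sketch of what this suggestion could look like, using only the standard library. The names `split_record` and `split_all`, the tuple layout `(userid, start, end, program)`, and the default worker and chunk sizes are all assumptions made for illustration; they are not from the original post.

```python
from concurrent.futures import ProcessPoolExecutor
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"
HOUR = timedelta(hours=1)


def split_record(record):
    """Split one (userid, start, end, program) tuple at every hour boundary.

    Returns a list of (userid, start, end, program, period) tuples, where
    period is the start of the clock hour that the piece falls in.
    """
    userid, start, end, program = record
    start = datetime.strptime(start, FMT)
    end = datetime.strptime(end, FMT)
    pieces = []
    # Begin at the clock hour that contains the start time.
    period = start.replace(minute=0, second=0, microsecond=0)
    while period < end:
        piece_start = max(start, period)
        piece_end = min(end, period + HOUR)
        if piece_start < piece_end:  # skip zero-length pieces
            pieces.append((
                userid,
                piece_start.strftime(FMT),
                piece_end.strftime(FMT),
                program,
                period.strftime(FMT),
            ))
        period += HOUR
    return pieces


def split_all(records, workers=4, chunksize=1000):
    """Split every record, spreading the work over several processes."""
    result = []
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # chunksize batches the records so each task carries many rows.
        for pieces in pool.map(split_record, records, chunksize=chunksize):
            result.extend(pieces)
    return result
```

On platforms that start worker processes by spawning (Windows, macOS), `split_all` should be called from under an `if __name__ == "__main__":` guard. Whether the extra processes pay off depends on how much of the run time is the splitting itself rather than the repeated `append` calls.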
    #2 · princelai · Aug 12, 2021 · ❤️ 1
    There is a way: https://paste.ubuntu.com/p/Df6JCbsPn6/

    Code:


    Result:
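The code and result of this reply are not included above, so what follows is only an illustration and may well differ from the linked paste. It is a rough sketch of one vectorized way to do the split with pandas and NumPy: repeat each row once per clock hour it touches, then clip the start and end times to that hour. The function name `split_by_hour` is hypothetical, the column names are taken from the sample header in the question, and the timestamps are assumed to be timezone-naive with no missing values.

```python
import numpy as np
import pandas as pd


def split_by_hour(df, start_col="STARTTIME", end_col="ENDTIME"):
    """Return one row per (record, clock hour) pair, without a Python loop."""
    start = pd.to_datetime(df[start_col])
    end = pd.to_datetime(df[end_col])
    first_hour = start.dt.floor("h")
    hour = pd.Timedelta(hours=1)

    # Number of clock hours each record touches (0 for empty or invalid ones).
    n_hours = np.ceil((end - first_hour) / hour).astype(int).clip(lower=0)
    reps = n_hours.to_numpy()

    # Repeat each row position once per hour it touches.
    pos = np.repeat(np.arange(len(df)), reps)
    out = df.iloc[pos].reset_index(drop=True)

    # k counts 0, 1, 2, ... within each original record.
    k = np.arange(len(pos)) - np.repeat(np.cumsum(reps) - reps, reps)
    one_hour = np.timedelta64(1, "h")
    period = first_hour.to_numpy()[pos] + k * one_hour

    # Clip every piece to the hour it belongs to.
    piece_start = np.maximum(start.to_numpy()[pos], period)
    piece_end = np.minimum(end.to_numpy()[pos], period + one_hour)

    out[start_col] = piece_start
    out[end_col] = piece_end
    out["PERIODTIME"] = period
    # Drop zero-length pieces (for example, a record whose start equals its end).
    return out[out[start_col] < out[end_col]].reset_index(drop=True)
```

With the frame from the question this would be called as `split_by_hour(df_tmp)`, passing `start_col` and `end_col` if the real column names differ from the sample header. The main saving comes from building the result in one pass instead of calling `append` once per row, which copies the growing DataFrame each time.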