近来大家伙被欧洲中心的数据下载困扰了很久,那么想办法另辟蹊径吧
本文基于旧文基于openmeteo 下载最新EC aifs预报数据进行简单修改
本项目旨在利用 OpenMeteo 平台提供的era5历史数据,进行气象数据分析和可视化。我们将使用 Python 编程语言和相关的气象数据处理工具来实现这一目标。
气象预报数据对于天气预测、气候研究以及环境监测等方面具有重要意义,包含了全球范围内的多种气象要素预报,如温度、湿度、风速等。
通过本项目,我们希望能够深入探索气象数据的价值,并为气象爱好者、科研人员以及气象行业工作者提供有益的工具和资源。
更多数据下载可从 OPEN-METEO API (https://open-meteo.com/en%EF%BC%89 下载气象数据
那么我们开始吧
由于代码过长隐藏,可前往基于open-meteo下载历史ERA5数据多种气象要素点击运行查看 🔜🔜若没有成功加载可视化图,点击运行可以查看 ps:隐藏代码在【代码已被隐藏】所在行,点击所在行,可以看到该行的最右角,会出现个三角形,点击查看即可
!pip install --user openmeteo-requests
!pip install requests-cache retry-requests
import openmeteo_requests
import requests_cache
import pandas as pd
from retry_requests import retry
# Setup the Open-Meteo API client with cache and retry on error
cache_session = requests_cache.CachedSession('.cache', expire_after = -1)
retry_session = retry(cache_session, retries = 5, backoff_factor = 0.2)
openmeteo = openmeteo_requests.Client(session = retry_session)
# Make sure all required weather variables are listed here
# The order of variables in hourly or daily is important to assign them correctly below
url = "https://archive-api.open-meteo.com/v1/archive"
params = {
"latitude": 52.52,
"longitude": 13.41,
"start_date": "2023-03-29",
"end_date": "2023-04-01",
"hourly":["temperature_2m", "relative_humidity_2m", "dew_point_2m", "surface_pressure", "wind_speed_10m"]
}
responses = openmeteo.weather_api(url, params=params)
# Process first location. Add a for-loop for multiple locations or weather models
response = responses[0]
print(f"Coordinates {response.Latitude()}°N {response.Longitude()}°E")
print(f"Elevation {response.Elevation()} m asl")
print(f"Timezone {response.Timezone()} {response.TimezoneAbbreviation()}")
print(f"Timezone difference to GMT+0 {response.UtcOffsetSeconds()} s")
# Process hourly data. The order of variables needs to be the same as requested.
hourly = response.Hourly()
hourly_temperature_2m = hourly.Variables(0).ValuesAsNumpy()
hourly_relative_humidity_2m = hourly.Variables(1).ValuesAsNumpy()
hourly_dew_point_2m = hourly.Variables(2).ValuesAsNumpy()
hourly_surface_pressure = hourly.Variables(3).ValuesAsNumpy()
hourly_wind_speed_10m = hourly.Variables(4).ValuesAsNumpy()
hourly_data = {"date": pd.date_range(
start = pd.to_datetime(hourly.Time(), unit = "s", utc = True),
end = pd.to_datetime(hourly.TimeEnd(), unit = "s", utc = True),
freq = pd.Timedelta(seconds = hourly.Interval()),
inclusive = "left"
)}
hourly_data["temperature_2m"] = hourly_temperature_2m
hourly_data["relative_humidity_2m"] = hourly_relative_humidity_2m
hourly_data["dew_point_2m"] = hourly_dew_point_2m
hourly_data["surface_pressure"] = hourly_surface_pressure
hourly_data["wind_speed_10m"] = hourly_wind_speed_10m
hourly_dataframe = pd.DataFrame(data = hourly_data)
print(hourly_dataframe.head(10))
Coordinates 52.5483283996582°N 13.407821655273438°E
Elevation 38.0 m asl
Timezone None None
Timezone difference to GMT+0 0 s
date temperature_2m relative_humidity_2m \
0 2023-03-29 00:00:00+00:00 1.1085 81.883591
1 2023-03-29 01:00:00+00:00 0.8085 83.363243
2 2023-03-29 02:00:00+00:00 1.6085 78.999466
3 2023-03-29 03:00:00+00:00 2.1085 75.114677
4 2023-03-29 04:00:00+00:00 1.9085 74.802864
5 2023-03-29 05:00:00+00:00 1.0085 77.748978
6 2023-03-29 06:00:00+00:00 1.9585 75.365379
7 2023-03-29 07:00:00+00:00 3.9085 71.423698
8 2023-03-29 08:00:00+00:00 5.9585 67.313850
9 2023-03-29 09:00:00+00:00 7.7585 65.560349
dew_point_2m surface_pressure wind_speed_10m
0 -1.6415 1016.877441 9.007196
1 -1.6915 1016.175476 9.885262
2 -1.6415 1015.393250 11.525623
3 -1.8415 1014.605896 13.009903
4 -2.0915 1014.204285 12.646200
5 -2.4415 1014.088867 11.520000
6 -1.9415 1013.707214 13.138765
7 -0.7915 1013.840393 12.224107
8 0.3585 1013.477295 14.007655
9 1.7085 1013.009827 14.168641
In [25]:
import openmeteo_requests
import requests_cache
import pandas as pd
from retry_requests import retry
import numpy as np
# 设置Open-Meteo API客户端并进行缓存和错误重试
cache_session = requests_cache.CachedSession('.cache', expire_after=3600)
retry_session = retry(cache_session, retries=5, backoff_factor=0.2)
openmeteo = openmeteo_requests.Client(session=retry_session)
# 生成要下载数据的格点经纬度范围
latitude_points = np.arange(52.0, 53.1, 0.25)
longitude_points = np.arange(13.0, 14.1, 0.25)
# 定义要查询的变量和模型
variables = ["temperature_2m", "relative_humidity_2m", "dew_point_2m", "surface_pressure", "wind_speed_10m"]
# 创建一个空列表,用于存储每个格点的数据
dataframes = []
# 循环遍历每个格点并下载数据
for lat in latitude_points:
for lon in longitude_points:
url = "https://archive-api.open-meteo.com/v1/archive"
params = {
"latitude": lat,
"longitude": lon,
"start_date": "2023-03-29",
"end_date": "2023-04-01",
"hourly": ",".join(variables),
}
responses = openmeteo.weather_api(url, params=params)
# 处理响应数据
for response in responses:
hourly = response.Hourly()
hourly_data = {
"date": pd.date_range(
start=pd.to_datetime(hourly.Time(), unit="s", utc=True),
end=pd.to_datetime(hourly.TimeEnd(), unit="s", utc=True),
freq=pd.Timedelta(seconds=hourly.Interval()),
inclusive="left"
)
}
for i, variable in enumerate(variables):
variable_data = hourly.Variables(i).ValuesAsNumpy()
hourly_data[variable] = variable_data
hourly_dataframe = pd.DataFrame(data=hourly_data)
dataframes.append(hourly_dataframe)
# 将所有数据拼接成一个表格
final_dataframe = pd.concat(dataframes, ignore_index=True)
# 打印最终的数据表格
print(final_dataframe)
date temperature_2m relative_humidity_2m \
0 2023-03-29 00:00:00+00:00 1.6220 76.708954
1 2023-03-29 01:00:00+00:00 1.6720 76.153656
2 2023-03-29 02:00:00+00:00 1.6220 76.145111
3 2023-03-29 03:00:00+00:00 1.6220 74.200531
4 2023-03-29 04:00:00+00:00 1.2720 72.501663
... ... ... ...
2395 2023-04-01 19:00:00+00:00 3.7545 77.622520
2396 2023-04-01 20:00:00+00:00 3.0045 79.500145
2397 2023-04-01 21:00:00+00:00 2.6545 79.158913
2398 2023-04-01 22:00:00+00:00 2.3545 78.824844
2399 2023-04-01 23:00:00+00:00 2.2045 77.940964
dew_point_2m surface_pressure wind_speed_10m
0 -2.0280 1008.537415 11.013882
1 -2.0780 1007.848816 12.074766
2 -2.1280 1007.155701 13.339445
3 -2.4780 1006.563538 13.004921
4 -3.1280 1006.151733 12.682018
... ... ... ...
2395 0.2045 1002.548828 22.267679
2396 -0.1955 1003.427124 21.385939
2397 -0.5955 1004.413757 21.758419
2398 -0.9455 1005.302063 22.264771
2399 -1.2455 1006.094055 20.929594
[2400 rows x 6 columns]
In [26]:
%%time
import openmeteo_requests
import requests_cache
import pandas as pd
from retry_requests import retry
import numpy as np
# 设置Open-Meteo API客户端并进行缓存和错误重试
cache_session = requests_cache.CachedSession('.cache', expire_after=3600)
retry_session = retry(cache_session, retries=5, backoff_factor=0.2)
openmeteo = openmeteo_requests.Client(session=retry_session)
# 生成要下载数据的格点经纬度范围
latitude_points = np.arange(45.0, 50.1, 0.25)
longitude_points = np.arange(115.0, 120.1, 0.25)
# 定义要查询的变量和模型
variables = ["temperature_2m", "relative_humidity_2m", "dew_point_2m", "surface_pressure", "wind_speed_10m"]
#model = "ecmwf_aifs025"
# 创建一个空列表,用于存储每个格点的数据
dataframes = []
# 循环遍历每个格点并下载数据
for lat in latitude_points:
for lon in longitude_points:
url = "https://archive-api.open-meteo.com/v1/archive"
params = {
"latitude": lat,
"longitude": lon,
"start_date": "2023-03-29",
"end_date": "2023-04-01",
"hourly": ",".join(variables),
}
responses = openmeteo.weather_api(url, params=params)
# 处理响应数据
for response in responses:
hourly = response.Hourly()
hourly_data = {
"date": pd.date_range(
start=pd.to_datetime(hourly.Time(), unit="s", utc=True),
end=pd.to_datetime(hourly.TimeEnd(), unit="s", utc=True),
freq=pd.Timedelta(seconds=hourly.Interval()),
inclusive="left"
),
"latitude": lat,
"longitude": lon
}
for i, variable in enumerate(variables):
variable_data = hourly.Variables(i).ValuesAsNumpy()
hourly_data[variable] = variable_data
hourly_dataframe = pd.DataFrame(data=hourly_data)
dataframes.append(hourly_dataframe)
# 将所有数据拼接成一个表格
final_dataframe = pd.concat(dataframes, ignore_index=True)
# 打印最终的数据表格
print(final_dataframe)
date latitude longitude temperature_2m \
0 2023-03-29 00:00:00+00:00 45.0 115.0 3.3635
1 2023-03-29 01:00:00+00:00 45.0 115.0 6.2635
2 2023-03-29 02:00:00+00:00 45.0 115.0 8.6135
3 2023-03-29 03:00:00+00:00 45.0 115.0 10.1635
4 2023-03-29 04:00:00+00:00 45.0 115.0 11.1135
... ... ... ... ...
42331 2023-04-01 19:00:00+00:00 50.0 120.0 -1.1890
42332 2023-04-01 20:00:00+00:00 50.0 120.0 -1.7390
42333 2023-04-01 21:00:00+00:00 50.0 120.0 -2.3890
42334 2023-04-01 22:00:00+00:00 50.0 120.0 -2.5390
42335 2023-04-01 23:00:00+00:00 50.0 120.0 -1.0890
relative_humidity_2m dew_point_2m surface_pressure wind_speed_10m
0 25.305822 -14.686501 860.471863 16.075521
1 20.003267 -15.086500 862.864014 14.529915
2 16.967871 -15.136500 863.092651 13.183080
3 16.461998 -14.236501 863.084045 9.565437
4 15.966474 -13.836500 863.203491 7.895416
... ... ... ... ...
42331 80.960815 -4.039000 921.916138 9.346143
42332 74.988991 -5.589000 922.102844 9.826088
42333 71.234772 -6.889000 922.800964 9.422101
42334 69.578217 -7.339000 923.115784 10.594036
42335 64.489380 -6.939000 924.129761 11.525623
[42336 rows x 8 columns]
CPU times: user 2.44 s, sys: 248 ms, total: 2.69 s
Wall time: 2min 17s
以上是5°×5°的空间范围,时间范围为4天的数据,耗时两分多钟
并行下载大大加快了下载速度
但是
请求格点太多,会被警告下个小时才可使用
大家可摸索一下较为合适的范围
In [28]:
# 修正时间数据类型为datetime
final_dataframe['date'] = pd.to_datetime(final_dataframe['date']).dt.tz_localize(None)
# 将DataFrame转换为xarray Dataset
ds = final_dataframe.set_index(['date', 'latitude', 'longitude']).to_xarray()
# 打印xarray Dataset
print(ds)
<xarray.Dataset>
Dimensions: (date: 96, latitude: 21, longitude: 21)
Coordinates:
* date (date) datetime64[ns] 2023-03-29 ... 2023-04-01T23:...
* latitude (latitude) float64 45.0 45.25 45.5 ... 49.5 49.75 50.0
* longitude (longitude) float64 115.0 115.2 115.5 ... 119.8 120.0
Data variables:
temperature_2m (date, latitude, longitude) float32 3.364 ... -1.089
relative_humidity_2m (date, latitude, longitude) float32 25.31 ... 64.49
dew_point_2m (date, latitude, longitude) float32 -14.69 ... -6.939
surface_pressure (date, latitude, longitude) float32 860.5 ... 924.1
wind_speed_10m (date, latitude, longitude) float32 16.08 ... 11.53
In [37]:
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeature
from meteva.base.tool.plot_tools import add_china_map_2basemap
# 选择要绘制的时间步长(这里为第10个)
rh_data =ds.relative_humidity_2m[10]
# 创建一个新的 figure,并设置地图投影
fig, ax = plt.subplots(figsize=(10, 6), subplot_kw={"projection": ccrs.PlateCarree()})
# 绘制相对湿度数据
rh_data.plot(ax=ax, transform=ccrs.PlateCarree(), cmap='YlGnBu_r', vmin=0, vmax=100, zorder=3)
add_china_map_2basemap(ax, name="province", edgecolor='k', lw=0.5, encoding='gbk', grid0=None) # "
# 设置坐标轴标签和标题
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
ax.set_title(f"Relative Humidity at 2m (Time Step {time_index})")
plt.show()
学习了以上方法,想必可以解大家获取ERA5数据的燃眉之急 还需注意的是大家不能一次性获取大量数据,不然会收到api限制的警告, open-meteo对每分钟内的请求次数进行了限制。