优化这段代码private static final Pattern P_QUOTED_ATTRIBUTES = Pattern.compile("([a-z0-9]+)=([\"'])(.*?)\\2", Pattern.CASE_INSENSITIVE | Pattern.DOTALL)
时间: 2024-02-03 12:12:07 浏览: 33
这段代码是用于匹配 HTML 标签中的属性和属性值,其中正则表达式 `([a-z0-9]+)=([\"'])(.*?)\\2` 匹配了一个属性和它的属性值。这个正则表达式可以进一步优化,具体有以下几个方面:
1. 去掉不必要的括号。正则表达式中的括号会生成捕获组,而如果不需要这些捕获组,可以将括号去掉以提高效率。因此,可以将 `([a-z0-9]+)` 和 `(.*?)` 中的括号去掉。
2. 使用单引号代替双引号。正则表达式中的双引号需要转义,而单引号则不需要,因此可以将 `([\"'])` 改为 `(')`。
3. 将属性名中的字符集改为更精确的字符集。属性名中可以包含字母、数字和减号,因此可以将 `[a-z0-9]` 改为 `[a-zA-Z0-9-]`。
综上所述,可以将原来的正则表达式改为 `([a-zA-Z0-9-]+)='(.*?)'`,代码如下:
```java
private static final Pattern P_QUOTED_ATTRIBUTES = Pattern.compile("([a-zA-Z0-9-]+)='(.*?)'", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
```
相关问题
import pandas as pd import pymysql # 连接到数据库 conn = pymysql.connect(host='localhost', user='user', password='password', database='database') # 获取所有表格的名称 cursor = conn.cursor() cursor.execute("SHOW TABLES") tables = cursor.fetchall() # 遍历所有表格 for table in tables: table_name = table[0] table_name_quoted = '' + table_name + '' # 检查是否存在名为'a'的列,如果不存在则添加'a'和'b'列 cursor.execute("SHOW COLUMNS FROM " + table_name_quoted + " LIKE 'a'") a_column = cursor.fetchone() if a_column is None: cursor.execute("ALTER TABLE " + table_name_quoted + " ADD COLUMN a DECIMAL(10,2)") cursor.execute("ALTER TABLE " + table_name_quoted + " ADD COLUMN b DECIMAL(10,2)") conn.commit() # 查询net_mf_amount列的数据 query = "SELECT trade_date, net_mf_amount FROM " + table_name_quoted + " ORDER BY trade_date DESC" df = pd.read_sql_query(query, conn) # 计算a和b列 a_column = [] b_column = [] for i in range(len(df)): if i == 0: a_column.append(None) b_column.append(None) else: if pd.notnull(df.iloc[i]['net_mf_amount']) and pd.notnull(df.iloc[i-1]['net_mf_amount']): if i-2 >= 0: if pd.notnull(df.iloc[i-2]['net_mf_amount']): a = df.iloc[i]['net_mf_amount'] - df.iloc[i-1]['net_mf_amount'] b = df.iloc[i]['net_mf_amount'] - df.iloc[i-2]['net_mf_amount'] a_column.append(a) b_column.append(b) else: j = i-3 while j >= 0: if pd.notnull(df.iloc[j]['net_mf_amount']): a = df.iloc[i]['net_mf_amount'] - df.iloc[i-1]['net_mf_amount'] b = df.iloc[i]['net_mf_amount'] - df.iloc[j]['net_mf_amount'] a_column.append(a) b_column.append(b) break j -= 1 else: a = df.iloc[i]['net_mf_amount'] - df.iloc[i-1]['net_mf_amount'] b = None a_column.append(a) b_column.append(b) else: a_column.append(None) b_column.append(None) # 将结果保存到数据库 for i in range(len(df)): cursor.execute("UPDATE " + table_name_quoted + " SET a=%s, b=%s WHERE trade_date=%s", (a_column[i], b_column[i], df.iloc[i]['trade_date'])) conn.commit() # 关闭连接 cursor.close() conn.close() 报错/Users/ljy/stock/daily_analyse.py:65: UserWarning: pandas only supports SQLAlchemy connectable (engine/connection) or database string URI or sqlite3 DBAPI2 connection. Other DBAPI2 objects are not tested. Please consider using SQLAlchemy. df = pd.read_sql_query(query, conn)
这段代码的报错和之前的一样,是因为使用的连接对象 conn 不是 SQLAlchemy 连接对象或者 SQLite3 的 DBAPI2 连接对象。你可以将该连接对象转换为 SQLAlchemy 连接对象或者使用数据库字符串 URI 来连接数据库。将连接数据库的代码修改如下:
```
import pandas as pd
from sqlalchemy import create_engine
# 连接到数据库
engine = create_engine('mysql+pymysql://user:password@localhost/database')
# 获取所有表格的名称
with engine.connect() as conn, conn.begin():
tables = conn.execute("SHOW TABLES").fetchall()
# 遍历所有表格
for table in tables:
table_name = table[0]
table_name_quoted = '`' + table_name + '`'
# 检查是否存在名为'a'的列,如果不存在则添加'a'和'b'列
with engine.connect() as conn, conn.begin():
a_column = conn.execute("SHOW COLUMNS FROM " + table_name_quoted + " LIKE 'a'").fetchone()
if a_column is None:
conn.execute("ALTER TABLE " + table_name_quoted + " ADD COLUMN a DECIMAL(10,2)")
conn.execute("ALTER TABLE " + table_name_quoted + " ADD COLUMN b DECIMAL(10,2)")
# 查询net_mf_amount列的数据
query = "SELECT trade_date, net_mf_amount FROM " + table_name_quoted + " ORDER BY trade_date DESC"
df = pd.read_sql_query(query, engine)
# 计算a和b列
a_column = []
b_column = []
for i in range(len(df)):
if i == 0:
a_column.append(None)
b_column.append(None)
else:
if pd.notnull(df.iloc[i]['net_mf_amount']) and pd.notnull(df.iloc[i-1]['net_mf_amount']):
if i-2 >= 0:
if pd.notnull(df.iloc[i-2]['net_mf_amount']):
a = df.iloc[i]['net_mf_amount'] - df.iloc[i-1]['net_mf_amount']
b = df.iloc[i]['net_mf_amount'] - df.iloc[i-2]['net_mf_amount']
a_column.append(a)
b_column.append(b)
else:
j = i-3
while j >= 0:
if pd.notnull(df.iloc[j]['net_mf_amount']):
a = df.iloc[i]['net_mf_amount'] - df.iloc[i-1]['net_mf_amount']
b = df.iloc[i]['net_mf_amount'] - df.iloc[j]['net_mf_amount']
a_column.append(a)
b_column.append(b)
break
j -= 1
else:
a = df.iloc[i]['net_mf_amount'] - df.iloc[i-1]['net_mf_amount']
b = None
a_column.append(a)
b_column.append(b)
else:
a_column.append(None)
b_column.append(None)
# 将结果保存到数据库
with engine.connect() as conn, conn.begin():
for i in range(len(df)):
conn.execute("UPDATE " + table_name_quoted + " SET a=%s, b=%s WHERE trade_date=%s", (a_column[i], b_column[i], df.iloc[i]['trade_date']))
# 关闭连接
engine.dispose()
```
注意:上述代码中的 user, password 和 database 分别对应的是你自己的用户名、密码和数据库名,需要进行修改。
import pandas as pd from sqlalchemy import create_engine # 连接到数据库 engine = create_engine('mysql+pymysql://user:password@localhost/database') # 获取所有表格的名称 with engine.connect() as conn, conn.begin(): tables = conn.execute("SHOW TABLES").fetchall() # 遍历所有表格 for table in tables: table_name = table[0] table_name_quoted = '' + table_name + '' # 检查是否存在名为'a'的列,如果不存在则添加'a'和'b'列 with engine.connect() as conn, conn.begin(): a_column = conn.execute("SHOW COLUMNS FROM " + table_name_quoted + " LIKE 'a'").fetchone() if a_column is None: conn.execute("ALTER TABLE " + table_name_quoted + " ADD COLUMN a DECIMAL(10,2)") conn.execute("ALTER TABLE " + table_name_quoted + " ADD COLUMN b DECIMAL(10,2)") # 查询net_mf_amount列的数据 query = "SELECT trade_date, net_mf_amount FROM " + table_name_quoted + " ORDER BY trade_date DESC" df = pd.read_sql_query(query, engine) # 计算a和b列 a_column = [] b_column = [] for i in range(len(df)): if i == 0: a_column.append(None) b_column.append(None) else: if pd.notnull(df.iloc[i]['net_mf_amount']) and pd.notnull(df.iloc[i-1]['net_mf_amount']): if i-2 >= 0: if pd.notnull(df.iloc[i-2]['net_mf_amount']): a = df.iloc[i]['net_mf_amount'] - df.iloc[i-1]['net_mf_amount'] b = df.iloc[i]['net_mf_amount'] - df.iloc[i-2]['net_mf_amount'] a_column.append(a) b_column.append(b) else: j = i-3 while j >= 0: if pd.notnull(df.iloc[j]['net_mf_amount']): a = df.iloc[i]['net_mf_amount'] - df.iloc[i-1]['net_mf_amount'] b = df.iloc[i]['net_mf_amount'] - df.iloc[j]['net_mf_amount'] a_column.append(a) b_column.append(b) break j -= 1 else: a = df.iloc[i]['net_mf_amount'] - df.iloc[i-1]['net_mf_amount'] b = None a_column.append(a) b_column.append(b) else: a_column.append(None) b_column.append(None) # 将结果保存到数据库 with engine.connect() as conn, conn.begin(): for i in range(len(df)): conn.execute("UPDATE " + table_name_quoted + " SET a=%s, b=%s WHERE trade_date=%s", (a_column[i], b_column[i], df.iloc[i]['trade_date'])) # 关闭连接 engine.dispose() 有5000个表格,使用多线程,线程池
可以使用Python内置的`concurrent.futures`模块实现多线程处理。具体实现可以参考以下代码:
```
import concurrent.futures
def process_table(table_name):
table_name_quoted = '' + table_name + ''
with engine.connect() as conn, conn.begin():
a_column = conn.execute("SHOW COLUMNS FROM " + table_name_quoted + " LIKE 'a'").fetchone()
if a_column is None:
conn.execute("ALTER TABLE " + table_name_quoted + " ADD COLUMN a DECIMAL(10,2)")
conn.execute("ALTER TABLE " + table_name_quoted + " ADD COLUMN b DECIMAL(10,2)")
query = "SELECT trade_date, net_mf_amount FROM " + table_name_quoted + " ORDER BY trade_date DESC"
df = pd.read_sql_query(query, engine)
a_column = []
b_column = []
for i in range(len(df)):
if i == 0:
a_column.append(None)
b_column.append(None)
else:
if pd.notnull(df.iloc[i]['net_mf_amount']) and pd.notnull(df.iloc[i-1]['net_mf_amount']):
if i-2 >= 0:
if pd.notnull(df.iloc[i-2]['net_mf_amount']):
a = df.iloc[i]['net_mf_amount'] - df.iloc[i-1]['net_mf_amount']
b = df.iloc[i]['net_mf_amount'] - df.iloc[i-2]['net_mf_amount']
a_column.append(a)
b_column.append(b)
else:
j = i-3
while j >= 0:
if pd.notnull(df.iloc[j]['net_mf_amount']):
a = df.iloc[i]['net_mf_amount'] - df.iloc[i-1]['net_mf_amount']
b = df.iloc[i]['net_mf_amount'] - df.iloc[j]['net_mf_amount']
a_column.append(a)
b_column.append(b)
break
j -= 1
else:
a = df.iloc[i]['net_mf_amount'] - df.iloc[i-1]['net_mf_amount']
b = None
a_column.append(a)
b_column.append(b)
else:
a_column.append(None)
b_column.append(None)
with engine.connect() as conn, conn.begin():
for i in range(len(df)):
conn.execute("UPDATE " + table_name_quoted + " SET a=%s, b=%s WHERE trade_date=%s", (a_column[i], b_column[i], df.iloc[i]['trade_date']))
print("Processed table:", table_name)
if __name__ == '__main__':
engine = create_engine('mysql+pymysql://user:password@localhost/database')
with engine.connect() as conn, conn.begin():
tables = conn.execute("SHOW TABLES").fetchall()
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
executor.map(process_table, [table[0] for table in tables])
engine.dispose()
```
在`main`函数中,首先获取所有表格的名称,然后使用`ThreadPoolExecutor`创建一个最大线程数为10的线程池。使用`map`方法将`process_table`函数和表格名称列表传入,线程池会自动分配任务并执行。每个线程都会执行`process_table`函数,对一张表格进行处理。
在`process_table`函数中,首先检查是否存在名为'a'的列,如果不存在则添加'a'和'b'列。然后从数据库中查询数据到`DataFrame`中,并对每一行数据计算'a'和'b'列的值。最后将结果更新到数据库中。每个线程在处理完一张表格后,会输出一条信息,表示该表格处理完成。
相关推荐
![zip](https://img-home.csdnimg.cn/images/20210720083736.png)
![rar](https://img-home.csdnimg.cn/images/20210720083606.png)
![](https://csdnimg.cn/download_wenku/file_type_ask_c1.png)
![](https://csdnimg.cn/download_wenku/file_type_ask_c1.png)
![](https://csdnimg.cn/download_wenku/file_type_ask_c1.png)
![](https://csdnimg.cn/download_wenku/file_type_ask_c1.png)
![](https://csdnimg.cn/download_wenku/file_type_ask_c1.png)
![](https://csdnimg.cn/download_wenku/file_type_ask_c1.png)
![](https://csdnimg.cn/download_wenku/file_type_ask_c1.png)
![](https://csdnimg.cn/download_wenku/file_type_ask_c1.png)
![](https://csdnimg.cn/download_wenku/file_type_ask_c1.png)
![](https://csdnimg.cn/download_wenku/file_type_ask_c1.png)
![](https://csdnimg.cn/download_wenku/file_type_ask_c1.png)
![](https://csdnimg.cn/download_wenku/file_type_ask_c1.png)
![pdf](https://img-home.csdnimg.cn/images/20210720083512.png)