Extracting numbers from a list of strings in Python

I have a list of strings that I am trying to parse for data that is meaningful to me. I need an ID number that is contained within each string. Sometimes there are two or even three of them. An example list:

lst1 = [
    '(Tower 3rd floor window corner_ : option 3_floor cut out_large : GA - floors : : model lines : id 3999595(tower 4rd floor window corner : option 3_floor: : whatever else is in iit " new floor : id 3999999)',
    '(Tower 3rd floor window corner_ : option 3_floor cut out_large : GA - floors : : model lines : id 3998895(tower 4rd floor window corner : option 3_floor: : id 5555456 whatever else is in iit " new floor : id 3998899)'
]

I would like to iterate over that list of strings and extract only the id values (the numbers that follow "id").

The output would be lst1 = ["3999595; 3999999", "3998895; 5555456; 3998899"], where the id values from the same input string are separated by a semicolon and the list order still matches the input list.

Solution

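You can use the regular expression id\s(\d{7}): it matches the literal text id, one whitespace character, and then captures exactly seven digits.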
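To see what the capture group does, here is a minimal sketch on a short, made-up sample string (the sample is an assumption for illustration, not taken from the question's data). Because the pattern contains one group, findall() returns only the digits and drops the "id " prefix:

```python
import re

# Hypothetical short sample string, used only to illustrate the pattern.
sample = "model lines : id 3999595(tower : id 3999999)"

# With a single capture group, findall() returns just the captured digits.
ids = re.findall(r'id\s(\d{7})', sample)
print(ids)  # ['3999595', '3999999']
```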

Iterate over the items in the list and join the results of each findall() call with "; ":

import re

lst1 = [
    '(Tower 3rd floor window corner_ : option 3_floor cut out_large : GA - floors : : model lines : id 3999595(tower 4rd floor window corner : option 3_floor: : whatever else is in iit " new floor : id 3999999)',
    '(Tower 3rd floor window corner_ : option 3_floor cut out_large : GA - floors : : model lines : id 3998895(tower 4rd floor window corner : option 3_floor: : id 5555456 whatever else is in iit " new floor : id 3998899)'
]

# Compile once; the group captures the seven digits that follow "id" and one whitespace character.
pattern = re.compile(r'id\s(\d{7})')

# One joined string per input item, so the output order matches the input order.
print(["; ".join(pattern.findall(item)) for item in lst1])

prints:

['3999595; 3999999', '3998895; 5555456; 3998899']
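If the ids are not always exactly seven digits long, the pattern can be loosened. The following is only a sketch under that assumption: the function name extract_ids, the sep parameter, and the sample strings are hypothetical, and \d+ with a leading \b is a swapped-in variant of the pattern used in the code before it, not part of the original answer.

```python
import re

# Assumption: ids may have any number of digits, so \d+ replaces \d{7}.
# \b stops words that merely end in "id" (e.g. "grid 12") from matching.
ID_PATTERN = re.compile(r'\bid\s+(\d+)')

def extract_ids(strings, sep="; "):
    """Return one joined string of ids per input string, in input order."""
    return [sep.join(ID_PATTERN.findall(s)) for s in strings]

result = extract_ids(["a : id 42(b : id 3999999)", "grid 12 : no ids here"])
print(result)  # ['42; 3999999', '']
```

A string with no ids yields an empty string, so the result always has the same length as the input list.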