Check if a string is in a Pandas DataFrame

newsource 2023. 9. 17. 13:16

I want to check whether a particular string exists in a particular column of my DataFrame.

I'm getting this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

import pandas as pd

BabyDataSet = [('Bob', 968), ('Jessica', 155), ('Mary', 77), ('John', 578), ('Mel', 973)]

a = pd.DataFrame(data=BabyDataSet, columns=['Names', 'Births'])

if a['Names'].str.contains('Mel'):
    print ("Mel is there")

a['Names'].str.contains('Mel') returns an indicator vector of boolean values of size len(BabyDataSet).

Therefore, you can count the matches:

mel_count=a['Names'].str.contains('Mel').sum()
if mel_count>0:
    print ("There are {m} Mels".format(m=mel_count))

Or use any(), if you don't care how many records match your query:

if a['Names'].str.contains('Mel').any():
    print ("Mel is there")

You should use any():

In [98]: a['Names'].str.contains('Mel').any()
Out[98]: True

In [99]: if a['Names'].str.contains('Mel').any():
   ....:     print("Mel is there")
   ....:
Mel is there

a['Names'].str.contains('Mel') gives you a Series of bool values:

In [100]: a['Names'].str.contains('Mel')
Out[100]:
0    False
1    False
2    False
3    False
4     True
Name: Names, dtype: bool
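
To make the cause of the error concrete, here is a minimal sketch (the variable names `names`, `mask`, `raised` and `has_mel` are illustrative, not from the original post). It shows that using the boolean Series directly in an `if` raises the ValueError, while `.any()` collapses it to a single value:

```python
import pandas as pd

names = pd.Series(["Bob", "Jessica", "Mary", "John", "Mel"], name="Names")
mask = names.str.contains("Mel")  # boolean Series, one value per row

try:
    if mask:  # a multi-element Series has no single truth value
        pass
    raised = False
except ValueError:
    raised = True

has_mel = bool(mask.any())  # reduce the mask to one plain bool
print(raised, has_mel)
```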

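The OP wanted to find out whether the string 'Mel' exists as a value in a particular column, not whether it is contained within any string in that column. So using contains is unnecessary, and it is also inefficient.

A simple equality test is enough: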

df = pd.DataFrame({"names": ["Melvin", "Mel", "Me", "Mel", "A.Mel"]})

# count the rows whose value is exactly 'Mel'
mel_count = (df['names'] == 'Mel').sum()
print("There are {num} instances of 'Mel'. ".format(num=mel_count))

# True if at least one row is exactly 'Mel'
mel_exists = (df['names'] == 'Mel').any()
if mel_exists:
    print("'Mel' exists in the dataframe.")

mel_exists2 = 'Mel' in df['names'].values 
print("'Mel' is in the dataframe: " + str(mel_exists2))

Prints:

There are 2 instances of 'Mel'. 
'Mel' exists in the dataframe.
'Mel' is in the dataframe: True
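
The difference between a substring test and an exact match can be seen side by side in the short sketch below, which reuses the sample values from this answer. Note that `str.contains` treats its pattern as a regular expression by default; `regex=False` is passed here so that the pattern is taken literally. The names `substring_hits` and `exact_hits` are illustrative only.

```python
import pandas as pd

names = pd.Series(["Melvin", "Mel", "Me", "Mel", "A.Mel"])

# substring test: matches every value that contains 'Mel' anywhere
substring_hits = int(names.str.contains("Mel", regex=False).sum())

# exact match: only values equal to 'Mel'
exact_hits = int((names == "Mel").sum())

print(substring_hits, exact_hits)
```

Which of the two is right depends on whether values such as 'Melvin' should count as a hit.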

I ran into the same problem and used:

if "Mel" in a["Names"].values:
    print("Yep")

However, this solution may be slower, because the Series is first exposed as an array of values and the in test then scans that array.
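
A note on why `.values` appears in that test: applying `in` directly to a Series checks the index labels, not the stored values. The sketch below illustrates this with made-up variable names, using `to_numpy()` to get at the values:

```python
import pandas as pd

names = pd.Series(["Bob", "Jessica", "Mary", "John", "Mel"])

in_index = "Mel" in names   # tests the index labels 0..4, not the values
label_hit = 4 in names      # 4 is an index label
in_values = bool("Mel" in names.to_numpy())  # tests the stored values

print(in_index, label_hit, in_values)
```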

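If there is any chance that you need to search for the empty string,

    a['Names'].str.contains('')

will NOT work, because it returns True for every row.

Instead, use

    '' in a["Names"].values

which accurately reflects whether or not a string is in the Series, including the edge case of searching for an empty string.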
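
The empty-string edge case can be checked with a small sketch like the following. The data and variable names are illustrative, and `to_numpy()` is used in place of `.values` as an assumption about preferred style:

```python
import pandas as pd

names = pd.Series(["Bob", "Jessica", "Mary", "John", "Mel"])

# the empty pattern matches every string, so this is True for a column
# that holds no empty string at all
contains_empty = bool(names.str.contains("").any())

# exact tests report what is really stored in the column
exact_empty = bool((names == "").any())
in_values = bool("" in names.to_numpy())

print(contains_empty, exact_empty, in_values)
```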

For a case-insensitive search:

a['Names'].str.lower().str.contains('mel').any()
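
As an alternative sketch, `str.contains` also accepts `case=False`, which avoids the separate `str.lower()` call, and `na=False`, which makes missing values count as non-matches. The sample data below, including the missing value, is made up for illustration:

```python
import pandas as pd

names = pd.Series(["Bob", None, "MELANIE", "John"])

# case=False ignores letter case; na=False treats missing values as no match
found = bool(names.str.contains("mel", case=False, na=False).any())
not_found = bool(names.str.contains("zed", case=False, na=False).any())

print(found, not_found)
```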

Pandas seems to recommend df.to_numpy, since the other methods still raise a FutureWarning: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_numpy.html#pandas.DataFrame.to_numpy

So an alternative that works in this case is:

b = a['Names']
c = b.to_numpy().tolist()
if 'Mel' in c:
    print("Mel is in the dataframe column Names")
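
Another option, not taken from the answers here, is `Series.isin`, which stays inside pandas and also handles several candidate strings at once. A minimal sketch with illustrative names:

```python
import pandas as pd

a = pd.DataFrame({"Names": ["Bob", "Jessica", "Mary", "John", "Mel"]})

# isin returns a boolean Series marking exact matches; any() reduces it
has_mel = bool(a["Names"].isin(["Mel"]).any())
has_any = bool(a["Names"].isin(["Zed", "Mary"]).any())
has_none = bool(a["Names"].isin(["Zed"]).any())

print(has_mel, has_any, has_none)
```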

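import re
s = 'string'

# list the case-insensitive matches of s found in each value of the column
matches = df['Name'].str.findall(s, flags=re.IGNORECASE)

# or keep only the rows whose value is exactly one of the given strings
matched_rows = df[df['Name'].isin(['string1', 'string2'])]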

import pandas as pd

(data_frame.col_name=='str_name_to_check').sum()

If you want to save the results, you can use this:

a['result'] = a['Names'].apply(lambda x : ','.join([item for item in str(x).split() if item.lower() in ['mel', 'etc']]))

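You should check the value produced by your line of code, for example by checking the length of the matching rows:

# filter the rows first; the unfiltered mask always has one entry per row
if len(a[a['Names'].str.contains('Mel')]) > 0:
    print("Name Present")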

Reference URL: https://stackoverflow.com/questions/30944577/check-if-string-is-in-a-pandas-dataframe
