
How do you analyze the COCO pose estimation dataset with Python?

2021-01-16 09:50

    # iterate over all images
    for img_id, img_fname, w, h, meta in get_meta(coco):
        images_data.append({
            'image_id': int(img_id),
            'path': img_fname,
            'width': int(w),
            'height': int(h)
        })

        # iterate over all person annotations of this image
        for m in meta:
            persons_data.append({
                'image_id': m['image_id'],
                'is_crowd': m['iscrowd'],
                'bbox': m['bbox'],
                'area': m['area'],
                'num_keypoints': m['num_keypoints'],
                'keypoints': m['keypoints'],
            })

    # create the dataframe with image paths
    images_df = pd.DataFrame(images_data)
    images_df.set_index('image_id', inplace=True)

    # create the dataframe with person annotations
    persons_df = pd.DataFrame(persons_data)
    persons_df.set_index('image_id', inplace=True)
    return images_df, persons_df
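We use the get_meta function to build two dataframes: one for image paths and one for person metadata. A single image may contain several people, so the relationship is one-to-many. In the next step, we merge the two tables on image_id and combine the training and validation sets. The text describes the merge as a left join; note that pd.merge defaults to how='inner', so pass how='left' if images without any person annotation should be kept. We also add a new column, source, which is 0 for the training set and 1 for the validation set. This information is necessary because we need to know which folder to search for an image. As you know, the images live in two folders: train2017/ and val2017/.

images_df, persons_df = convert_to_df(train_coco)
train_coco_df = pd.merge(images_df, persons_df, right_index=True, left_index=True)
train_coco_df['source'] = 0  # 0 = training set
images_df, persons_df = convert_to_df(val_coco)
val_coco_df = pd.merge(images_df, persons_df, right_index=True, left_index=True)
val_coco_df['source'] = 1  # 1 = validation set
coco_df = pd.concat([train_coco_df, val_coco_df], ignore_index=True)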
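The merge and the source column can be tried out on a tiny hand-made example. The sketch below uses made-up rows in place of real COCO annotations, and the names SOURCE_DIRS, image_folder and full_path are hypothetical (they are not part of the original code); they only illustrate how the source flag could be mapped back to train2017/ or val2017/.

```python
import os
import pandas as pd

# Toy stand-ins for the frames returned by convert_to_df (values are made up).
images_df = pd.DataFrame({
    "image_id": [1, 2],
    "path": ["a.jpg", "b.jpg"],
    "width": [640, 480],
    "height": [480, 640],
}).set_index("image_id")
persons_df = pd.DataFrame({
    "image_id": [1, 1, 2],
    "is_crowd": [0, 0, 1],
    "num_keypoints": [17, 0, 0],
}).set_index("image_id")

# Index-on-index merge: each image row is repeated once per person (one-to-many).
merged = pd.merge(images_df, persons_df, left_index=True, right_index=True)
merged["source"] = 0  # 0 = training set, 1 = validation set

# Hypothetical helper: choose the image folder from the source flag.
SOURCE_DIRS = {0: "train2017", 1: "val2017"}

def image_folder(source):
    return SOURCE_DIRS[int(source)]

merged["full_path"] = [
    os.path.join(image_folder(s), p)
    for s, p in zip(merged["source"], merged["path"])
]
print(merged[["path", "source", "full_path"]])
```

Image 1 appears twice in the merged frame because it has two person rows, which is exactly the one-to-many relationship described above.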
Finally, we have a single dataframe that represents the whole COCO dataset.

How many people are in an image

Now we can run the first analysis. The COCO dataset contains images with multiple people, and we want to know how many images contain only one person. The code is as follows:

# count the annotations
annotated_persons_df = coco_df[coco_df['is_crowd'] == 0]
crowd_df = coco_df[coco_df['is_crowd'] == 1]
print("Number of people in total: " + str(len(annotated_persons_df)))
print("Number of crowd annotations: " + str(len(crowd_df)))
persons_in_img_df = pd.DataFrame({
    'cnt': annotated_persons_df['path'].value_counts()
})
persons_in_img_df.reset_index(level=0, inplace=True)
# older pandas versions name the reset column 'index'; newer ones already call it 'path'
persons_in_img_df.rename(columns={'index': 'path'}, inplace=True)
# group by cnt, so we get a dataframe with the number of images per annotated-person count
persons_in_img_df = persons_in_img_df.groupby(['cnt']).count()
# extract the arrays
x_occurences = persons_in_img_df.index.values
y_images = persons_in_img_df['path'].values
# plot
plt.bar(x_occurences, y_images)
plt.title('People on a single image')
plt.xticks(x_occurences, x_occurences)
plt.xlabel('Number of people in a single image')
plt.ylabel('Number of images')
plt.show()
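The resulting chart:

As you can see, most COCO images contain a single person. But there are also quite a few images with 13 people. Let's look at a few examples:

Well, there is even one image with 19 (non-crowd) annotations: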

Shouldn't the top area of this image be marked as a crowd? Yes, it should, but instead we have multiple bounding boxes without any keypoints! Such annotations should be treated the same way as crowds, which means they should be masked out. In this image, only the 3 boxes in the middle have any keypoints. Let's refine the query to get statistics on images containing people with and without keypoints, as well as the total number of people with and without keypoints:

annotated_persons_nokp_df = coco_df[(coco_df['is_crowd'] == 0) & (coco_df['num_keypoints'] == 0)]
annotated_persons_kp_df = coco_df[(coco_df['is_crowd'] == 0) & (coco_df['num_keypoints'] > 0)]
print("Number of people (with keypoints) in total: " +
      str(len(annotated_persons_kp_df)))
print("Number of people without any keypoints in total: " +
      str(len(annotated_persons_nokp_df)))
persons_in_img_kp_df = pd.DataFrame({
    'cnt': annotated_persons_kp_df[['path', 'source']].value_counts()
})
persons_in_img_kp_df.reset_index(level=[0, 1], inplace=True)
persons_in_img_cnt_df = persons_in_img_kp_df.groupby(['cnt']).count()
x_occurences_kp = persons_in_img_cnt_df.index.values
y_images_kp = persons_in_img_cnt_df['path'].values
f = plt.figure(figsize=(14, 8))
width = 0.4
plt.bar(x_occurences_kp, y_images_kp, width=width, label='with keypoints')
# x_occurences and y_images come from the previous count, which covers all
# non-crowd annotations, whether or not they have keypoints
plt.bar(x_occurences + width, y_images, width=width, label='no keypoints')
plt.title('People on a single image')
plt.xticks(x_occurences + width/2, x_occurences)
plt.xlabel('Number of people in a single image')
plt.ylabel('Number of images')
plt.legend(loc='best')
plt.show()
Now we can see that the difference is obvious.
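To make the counting logic easier to follow, here is a minimal sketch on a made-up frame. It assumes the same column names as coco_df (path, source, is_crowd, num_keypoints) but swaps the value_counts-then-groupby steps for an equivalent groupby(...).size() followed by value_counts(). The function name images_per_person_count and all the data are hypothetical.

```python
import pandas as pd

# Toy stand-in for coco_df (values are made up for illustration).
toy_df = pd.DataFrame({
    "path": ["a.jpg", "a.jpg", "a.jpg", "b.jpg", "c.jpg", "c.jpg"],
    "source": [0, 0, 0, 0, 1, 1],
    "is_crowd": [0, 0, 1, 0, 0, 0],
    "num_keypoints": [12, 0, 0, 17, 5, 9],
})

# Same filters as in the article: drop crowds, then keep people with keypoints.
non_crowd = toy_df[toy_df["is_crowd"] == 0]
with_kp = non_crowd[non_crowd["num_keypoints"] > 0]

def images_per_person_count(df):
    """Map 'people in one image' to 'number of images with that many people'."""
    per_image = df.groupby(["path", "source"]).size()  # people per image
    return per_image.value_counts().sort_index()       # images per people-count

all_counts = images_per_person_count(non_crowd)
kp_counts = images_per_person_count(with_kp)
print(all_counts.to_dict())
print(kp_counts.to_dict())
```

In this toy data, a.jpg drops from two people to one once the annotation without keypoints is filtered out, which is the same kind of shift the two bar series show on the real dataset.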

The official COCO page states that 250,000 people have keypoints, yet we found only 156,165 such instances. They should probably drop the words "with keypoints".

Adding extra columns

Once COCO has been converted to a pandas dataframe, we can easily add extra columns computed from the existing ones. I think it is best to extract all the keypoint coordinates into separate columns. In addition, we can add a column with a scale factor. Scale information about a person's bounding box is particularly useful: for example, we may want to discard all people who are too small, or perform an upscaling operation. To achieve this, we use transformer objects from the Python library sklearn. In general, sklearn transformers are powerful tools for cleaning, reducing, expanding, and generating feature representations in data science models. We will only use a small part of the API. The code is as follows:

import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class AttributesAdder(BaseEstimator, TransformerMixin):
    def __init__(self, num_keypoints, w_ix, h_ix, bbox_ix, kp_ix):
        """
        :param num_keypoints: number of keypoints
        :param w_ix: index of the column containing the image width
        :param h_ix: index of the column containing the image height
        :param bbox_ix: index of the column containing the bounding box data
        :param kp_ix: index of the column containing the keypoint data
        """
        self.num_keypoints = num_keypoints
        self.w_ix = w_ix
        self.h_ix = h_ix
        self.bbox_ix = bbox_ix
        self.kp_ix = kp_ix

    def fit(self, X, y=None):
        # nothing to learn; the transformer is stateless
        return self

    def transform(self, X):

        # retrieve the specific columns

        w = X[:, self.w_ix]
        h = X[:, self.h_ix]
        bbox = np.array(X[:, self.bbox_ix].tolist())  # to matrix
        keypoints = np.array(X[:, self.kp_ix].tolist())  # to matrix