4. RetinaNet


3μž₯μ—μ„œλŠ” 제곡된 데이터에 augmentation을 κ°€ν•˜λŠ” 방법과 데이터셋 클래슀λ₯Ό λ§Œλ“œλŠ” 방법을 ν™•μΈν–ˆμŠ΅λ‹ˆλ‹€. 이번 μž₯μ—μ„œλŠ” torchvisionμ—μ„œ μ œκ³΅ν•˜λŠ” one-stage λͺ¨λΈμΈ RetinaNet을 ν™œμš©ν•΄ 의료용 마슀크 κ²€μΆœ λͺ¨λΈμ„ κ΅¬μΆ•ν•΄λ³΄κ² μŠ΅λ‹ˆλ‹€.

In sections 4.1 through 4.3, building on what we covered in Chapters 2 and 3, we will load the data, split it into training and test sets, and define the dataset class. In section 4.4 we will load a pre-trained model using the torchvision API. In section 4.5 we will train the model via transfer learning, and in section 4.6 we will generate predictions and evaluate the model's performance.

4.1 Downloading the Data

λͺ¨λΈλ§ μ‹€μŠ΅μ„ μœ„ν•΄ 2.1μ ˆμ— λ‚˜μ˜¨ μ½”λ“œλ₯Ό ν™œμš©ν•˜μ—¬ 데이터λ₯Ό λΆˆλŸ¬μ˜€κ² μŠ΅λ‹ˆλ‹€.

!git clone https://github.com/Pseudo-Lab/Tutorial-Book-Utils
!python Tutorial-Book-Utils/PL_data_loader.py --data FaceMaskDetection
!unzip -q Face\ Mask\ Detection.zip
Cloning into 'Tutorial-Book-Utils'...
remote: Enumerating objects: 12, done.
remote: Counting objects: 100% (12/12), done.
remote: Compressing objects: 100% (11/11), done.
remote: Total 12 (delta 1), reused 2 (delta 0), pack-reused 0
Unpacking objects: 100% (12/12), done.
Face Mask Detection.zip is done!

4.2 Splitting the Data

3.3μ ˆμ—μ„œ ν™•μΈν•œ 데이터 뢄리 방법을 ν™œμš©ν•˜μ—¬ 데이터λ₯Ό λΆ„λ¦¬ν•˜κ² μŠ΅λ‹ˆλ‹€.

import os
import random
import numpy as np
import shutil

print(len(os.listdir('annotations')))
print(len(os.listdir('images')))

!mkdir test_images
!mkdir test_annotations


random.seed(1234)  # fix the random seed so the split is reproducible
idx = random.sample(range(853), 170)  # hold out 170 of the 853 samples for testing

for img in np.array(sorted(os.listdir('images')))[idx]:
    shutil.move('images/'+img, 'test_images/'+img)

for annot in np.array(sorted(os.listdir('annotations')))[idx]:
    shutil.move('annotations/'+annot, 'test_annotations/'+annot)

print(len(os.listdir('annotations')))
print(len(os.listdir('images')))
print(len(os.listdir('test_annotations')))
print(len(os.listdir('test_images')))
853
853
683
683
170
170

4.3 Defining the Dataset Class

νŒŒμ΄ν† μΉ˜ λͺ¨λΈμ„ ν•™μŠ΅μ‹œν‚€κΈ° μœ„ν•΄μ„  데이터셋 클래슀λ₯Ό μ •μ˜ν•΄μ•Ό ν•©λ‹ˆλ‹€. torchvisionμ—μ„œ μ œκ³΅ν•˜λŠ” 객체 탐지 λͺ¨λΈμ„ ν•™μŠ΅μ‹œν‚€κΈ° μœ„ν•œ 데이터셋 클래슀의 __getitem__ λ©”μ„œλ“œλŠ” 이미지 파일과 λ°”μš΄λ”© λ°•μŠ€ μ’Œν‘œλ₯Ό λ°˜ν™˜ ν•©λ‹ˆλ‹€. 데이터셋 클래슀λ₯Ό 3μž₯μ—μ„œ ν™œμš©ν•œ μ½”λ“œλ₯Ό μ‘μš©ν•΄ μ•„λž˜μ™€ 같이 μ •μ˜ ν•˜κ² μŠ΅λ‹ˆλ‹€.

import os
import glob
import time
import numpy as np
import cv2
import torch
import torchvision
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import matplotlib.patches as patches
from bs4 import BeautifulSoup
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

def generate_box(obj):
    
    xmin = float(obj.find('xmin').text)
    ymin = float(obj.find('ymin').text)
    xmax = float(obj.find('xmax').text)
    ymax = float(obj.find('ymax').text)
    
    return [xmin, ymin, xmax, ymax]

def generate_label(obj):
    # 1: with_mask, 2: mask_weared_incorrect, 0: without_mask
    if obj.find('name').text == "with_mask":
        return 1
    elif obj.find('name').text == "mask_weared_incorrect":
        return 2
    return 0

def generate_target(file): 
    with open(file) as f:
        data = f.read()
        soup = BeautifulSoup(data, "html.parser")
        objects = soup.find_all("object")

        num_objs = len(objects)

        boxes = []
        labels = []
        for i in objects:
            boxes.append(generate_box(i))
            labels.append(generate_label(i))

        boxes = torch.as_tensor(boxes, dtype=torch.float32) 
        labels = torch.as_tensor(labels, dtype=torch.int64) 
        
        target = {}
        target["boxes"] = boxes
        target["labels"] = labels
        
        return target

def plot_image_from_output(img, annotation):

    img = img.cpu().permute(1,2,0)  # [C, H, W] tensor -> [H, W, C] for matplotlib

    rects = []

    # red: without_mask (0), green: with_mask (1), orange: mask_weared_incorrect (2)
    colors = {0: 'r', 1: 'g', 2: 'orange'}

    for idx in range(len(annotation["boxes"])):
        xmin, ymin, xmax, ymax = annotation["boxes"][idx]
        edgecolor = colors.get(int(annotation['labels'][idx]), 'orange')
        rect = patches.Rectangle((xmin, ymin), (xmax - xmin), (ymax - ymin),
                                 linewidth=1, edgecolor=edgecolor, facecolor='none')
        rects.append(rect)

    return img, rects

class MaskDataset(Dataset):
    def __init__(self, path, transform=None):
        self.path = path
        self.imgs = list(sorted(os.listdir(self.path)))
        self.transform = transform
        
    def __len__(self):
        return len(self.imgs)

    def __getitem__(self, idx):
        file_image = self.imgs[idx]
        file_label = self.imgs[idx][:-3] + 'xml'
        img_path = os.path.join(self.path, file_image)
        
        if 'test' in self.path:
            label_path = os.path.join("test_annotations/", file_label)
        else:
            label_path = os.path.join("annotations/", file_label)

        img = Image.open(img_path).convert("RGB")
        target = generate_target(label_path)
        
        to_tensor = torchvision.transforms.ToTensor()

        if self.transform:
            img, transform_target = self.transform(np.array(img), np.array(target['boxes']))
            target['boxes'] = torch.as_tensor(transform_target)

        # convert the image to a tensor
        img = to_tensor(img)


        return img, target

def collate_fn(batch):
    return tuple(zip(*batch))

dataset = MaskDataset('images/')
test_dataset = MaskDataset('test_images/')

data_loader = torch.utils.data.DataLoader(dataset, batch_size=4, collate_fn=collate_fn)
test_data_loader = torch.utils.data.DataLoader(test_dataset, batch_size=2, collate_fn=collate_fn)

μ΅œμ’…μ μœΌλ‘œ ν›ˆλ ¨μš© 데이터와 μ‹œν—˜μš© 데이터λ₯Ό batch λ‹¨μœ„λ‘œ 뢈러올 수 있게 torch.utils.data.DataLoader ν•¨μˆ˜λ₯Ό ν™œμš©ν•΄ data_loader와 test_data_loaderλ₯Ό 각각 μ •μ˜ν•©λ‹ˆλ‹€.

4.4 λͺ¨λΈ 뢈러였기¢

torchvisionμ—μ„œλŠ” 각쒅 컴퓨터 λΉ„μ „ 문제λ₯Ό ν•΄κ²°ν•˜κΈ° μœ„ν•œ λ”₯λŸ¬λ‹ λͺ¨λΈμ„ μ‰½κ²Œ 뢈러올 수 μžˆλŠ” APIλ₯Ό μ œκ³΅ν•©λ‹ˆλ‹€. torchvision.models λͺ¨λ“ˆμ„ ν™œμš©ν•˜μ—¬ RetinaNet λͺ¨λΈμ„ λΆˆλŸ¬μ˜€λ„λ‘ ν•˜κ² μŠ΅λ‹ˆλ‹€. RetinaNet은 torchvision 0.8.0 μ΄μƒμ—μ„œ μ œκ³΅λ˜λ―€λ‘œ, μ•„λž˜ μ½”λ“œλ₯Ό ν™œμš©ν•˜μ—¬ torchvision 버전을 λ§žμΆ°μ€λ‹ˆλ‹€.

!pip install torch==1.7.0+cu101 torchvision==0.8.1+cu101 torchaudio==0.7.0 -f https://download.pytorch.org/whl/torch_stable.html
Looking in links: https://download.pytorch.org/whl/torch_stable.html
Requirement already satisfied: torch==1.7.0+cu101 in /usr/local/lib/python3.6/dist-packages (1.7.0+cu101)
Requirement already satisfied: torchvision==0.8.1+cu101 in /usr/local/lib/python3.6/dist-packages (0.8.1+cu101)
Collecting torchaudio==0.7.0
  Downloading https://files.pythonhosted.org/packages/3f/23/6b54106b3de029d3f10cf8debc302491c17630357449c900d6209665b302/torchaudio-0.7.0-cp36-cp36m-manylinux1_x86_64.whl (7.6MB)
     |████████████████████████████████| 7.6MB 11.1MB/s
Requirement already satisfied: dataclasses in /usr/local/lib/python3.6/dist-packages (from torch==1.7.0+cu101) (0.8)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.6/dist-packages (from torch==1.7.0+cu101) (3.7.4.3)
Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from torch==1.7.0+cu101) (1.18.5)
Requirement already satisfied: future in /usr/local/lib/python3.6/dist-packages (from torch==1.7.0+cu101) (0.16.0)
Requirement already satisfied: pillow>=4.1.1 in /usr/local/lib/python3.6/dist-packages (from torchvision==0.8.1+cu101) (7.0.0)
Installing collected packages: torchaudio
Successfully installed torchaudio-0.7.0
import torchvision
import torch
torchvision.__version__
'0.8.1+cu101'

The torchvision.__version__ command confirms that torchvision 0.8.1, built for CUDA 10.1, is installed. Next, run the code below to load the RetinaNet model. Since the Face Mask Detection dataset has three classes, we set the num_classes parameter to 3. Because we will perform transfer learning, we load pre-trained weights for the backbone only and leave the remaining weights randomly initialized; with pretrained_backbone=True, the ResNet-50 backbone comes pre-trained on ImageNet.

retina = torchvision.models.detection.retinanet_resnet50_fpn(num_classes = 3, pretrained=False, pretrained_backbone = True)
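Note that torchvision detection models behave differently depending on mode: in train mode they take images and targets and return a dictionary of losses, while in eval mode they take only images and return per-image detections. A minimal sketch with a random dummy image (illustration only, not part of the pipeline):

# a single random 3-channel image; detection models take a list of [C, H, W] tensors
dummy = [torch.rand(3, 300, 400)]
retina.eval()
with torch.no_grad():
    out = retina(dummy)
print(out[0].keys())  # per-image detections: dict_keys(['boxes', 'scores', 'labels'])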

4.5 Transfer Learning

λͺ¨λΈμ„ λΆˆλŸ¬μ™”μœΌλ©΄ μ•„λž˜ μ½”λ“œλ₯Ό ν™œμš©ν•˜μ—¬ 전이 ν•™μŠ΅μ„ μ§„ν–‰ν•©λ‹ˆλ‹€.

device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

num_epochs = 30
retina.to(device)
    
# parameters
params = [p for p in retina.parameters() if p.requires_grad] # keep only the parameters that require gradient computation
optimizer = torch.optim.SGD(params, lr=0.005,
                                momentum=0.9, weight_decay=0.0005)

len_dataloader = len(data_loader)

# takes roughly 4 minutes per epoch
for epoch in range(num_epochs):
    start = time.time()
    retina.train()

    i = 0    
    epoch_loss = 0
    for images, targets in data_loader:
        images = list(image.to(device) for image in images)
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

        loss_dict = retina(images, targets)  # in train mode, the model returns a dict of losses

        losses = sum(loss for loss in loss_dict.values())  # total loss: classification + box regression

        i += 1

        optimizer.zero_grad()
        losses.backward()
        optimizer.step()
        
        epoch_loss += losses 
    print(epoch_loss, f'time: {time.time() - start}')
/usr/local/lib/python3.6/dist-packages/torch/nn/_reduction.py:44: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
  warnings.warn(warning.format(ret))
tensor(285.9670, device='cuda:0', grad_fn=<AddBackward0>) time: 242.22558188438416
tensor(268.1001, device='cuda:0', grad_fn=<AddBackward0>) time: 251.5482075214386
tensor(248.4554, device='cuda:0', grad_fn=<AddBackward0>) time: 248.92862486839294
tensor(233.0612, device='cuda:0', grad_fn=<AddBackward0>) time: 249.69438576698303
tensor(234.2285, device='cuda:0', grad_fn=<AddBackward0>) time: 247.88670659065247
tensor(202.4744, device='cuda:0', grad_fn=<AddBackward0>) time: 249.68517541885376
tensor(172.9739, device='cuda:0', grad_fn=<AddBackward0>) time: 250.47061586380005
tensor(125.8968, device='cuda:0', grad_fn=<AddBackward0>) time: 251.4771168231964
tensor(102.0443, device='cuda:0', grad_fn=<AddBackward0>) time: 251.20848298072815
tensor(88.1749, device='cuda:0', grad_fn=<AddBackward0>) time: 251.144877910614
tensor(78.1594, device='cuda:0', grad_fn=<AddBackward0>) time: 251.8066761493683
tensor(73.6921, device='cuda:0', grad_fn=<AddBackward0>) time: 251.669575214386
tensor(69.6965, device='cuda:0', grad_fn=<AddBackward0>) time: 251.8230264186859
tensor(63.9101, device='cuda:0', grad_fn=<AddBackward0>) time: 252.08272123336792
tensor(56.2955, device='cuda:0', grad_fn=<AddBackward0>) time: 252.18470931053162
tensor(56.2638, device='cuda:0', grad_fn=<AddBackward0>) time: 252.03237462043762
tensor(50.2047, device='cuda:0', grad_fn=<AddBackward0>) time: 252.09569120407104
tensor(45.9254, device='cuda:0', grad_fn=<AddBackward0>) time: 253.205641746521
tensor(44.4599, device='cuda:0', grad_fn=<AddBackward0>) time: 253.05651235580444
tensor(43.9277, device='cuda:0', grad_fn=<AddBackward0>) time: 253.1837260723114
tensor(40.4117, device='cuda:0', grad_fn=<AddBackward0>) time: 253.18618297576904
tensor(39.0882, device='cuda:0', grad_fn=<AddBackward0>) time: 253.36814761161804
tensor(35.3732, device='cuda:0', grad_fn=<AddBackward0>) time: 253.41503262519836
tensor(34.0460, device='cuda:0', grad_fn=<AddBackward0>) time: 252.93738174438477
tensor(35.8844, device='cuda:0', grad_fn=<AddBackward0>) time: 253.25822925567627
tensor(33.1177, device='cuda:0', grad_fn=<AddBackward0>) time: 253.25469851493835
tensor(28.4753, device='cuda:0', grad_fn=<AddBackward0>) time: 253.2648823261261
tensor(30.3831, device='cuda:0', grad_fn=<AddBackward0>) time: 253.4244725704193
tensor(28.0954, device='cuda:0', grad_fn=<AddBackward0>) time: 253.57142424583435
tensor(28.5899, device='cuda:0', grad_fn=<AddBackward0>) time: 253.16517424583435

λͺ¨λΈ μž¬μ‚¬μš©μ„ μœ„ν•΄ μ•„λž˜ μ½”λ“œλ₯Ό μ‹€ν–‰ν•˜μ—¬ ν•™μŠ΅λœ κ°€μ€‘μΉ˜λ₯Ό μ €μž₯ν•΄μ€λ‹ˆλ‹€. torch.save ν•¨μˆ˜λ₯Ό ν™œμš©ν•΄ μ§€μ •ν•œ μœ„μΉ˜μ— ν•™μŠ΅λœ κ°€μ€‘μΉ˜λ₯Ό μ €μž₯ν•  수 μžˆμŠ΅λ‹ˆλ‹€.

torch.save(retina.state_dict(),f'retina_{num_epochs}.pt')
retina.load_state_dict(torch.load(f'retina_{num_epochs}.pt'))
<All keys matched successfully>

ν•™μŠ΅λœ κ°€μ€‘μΉ˜λ₯Ό 뢈러올 λ•ŒλŠ” load_state_dictκ³Ό torch.loadν•¨μˆ˜λ₯Ό μ‚¬μš©ν•˜λ©΄ λ©λ‹ˆλ‹€. λ§Œμ•½ retina λ³€μˆ˜λ₯Ό μƒˆλ‘­κ²Œ μ§€μ •ν–ˆμ„ 경우, ν•΄λ‹Ή λͺ¨λΈμ„ GPU λ©”λͺ¨λ¦¬μ— μ˜¬λ €μ£Όμ–΄μ•Ό GPU 연산이 κ°€λŠ₯ν•©λ‹ˆλ‹€.

device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
retina.to(device)

4.6 Prediction

ν›ˆλ ¨μ΄ 마무리 λ˜μ—ˆμœΌλ©΄, 예츑 κ²°κ³Όλ₯Ό ν™•μΈν•˜λ„λ‘ ν•˜κ² μŠ΅λ‹ˆλ‹€. test_data_loaderμ—μ„œ 데이터λ₯Ό λΆˆλŸ¬μ™€ λͺ¨λΈμ— λ„£μ–΄ ν•™μŠ΅ ν›„, 예츑된 결과와 μ‹€μ œ 값을 각각 μ‹œκ°ν™” 해보도둝 ν•˜κ² μŠ΅λ‹ˆλ‹€. μš°μ„  μ˜ˆμΈ‘μ— ν•„μš”ν•œ ν•¨μˆ˜λ₯Ό μ •μ˜ν•˜κ² μŠ΅λ‹ˆλ‹€.

def make_prediction(model, img, threshold):
    model.eval()
    preds = model(img)
    for id in range(len(preds)) :
        idx_list = []

        for idx, score in enumerate(preds[id]['scores']) :
            if score > threshold : # collect indices whose score exceeds the threshold
                idx_list.append(idx)

        preds[id]['boxes'] = preds[id]['boxes'][idx_list]
        preds[id]['labels'] = preds[id]['labels'][idx_list]
        preds[id]['scores'] = preds[id]['scores'][idx_list]


    return preds
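Before scoring the whole test set, we can sanity-check make_prediction on a single test image (a minimal sketch; the index 0 is arbitrary):

img0, _ = test_dataset[0]
with torch.no_grad():
    pred0 = make_prediction(retina, [img0.to(device)], 0.5)
print(pred0[0]['labels'], pred0[0]['scores'])  # detections kept above the 0.5 threshold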

The make_prediction function applies the trained deep learning model and filters its output: by adjusting the threshold parameter, only bounding boxes whose confidence exceeds a given level are kept. A threshold of 0.5 is a common choice. Next, we use a for loop to run predictions on every batch in test_data_loader.

from tqdm import tqdm

labels = []
preds_adj_all = []
annot_all = []

for im, annot in tqdm(test_data_loader, position = 0, leave = True):
    im = list(img.to(device) for img in im)
    #annot = [{k: v.to(device) for k, v in t.items()} for t in annot]

    for t in annot:
        labels += t['labels']

    with torch.no_grad():
        preds_adj = make_prediction(retina, im, 0.5)
        preds_adj = [{k: v.to(torch.device('cpu')) for k, v in t.items()} for t in preds_adj]
        preds_adj_all.append(preds_adj)
        annot_all.append(annot)
100%|██████████| 85/85 [00:24<00:00,  3.47it/s]

We use tqdm to display a progress bar. All predictions are stored in the preds_adj_all variable. Next, let's visualize the ground-truth and predicted bounding boxes.

nrows = 8
ncols = 2
fig, axes = plt.subplots(nrows=nrows, ncols=ncols, figsize=(ncols*4, nrows*4))

batch_i = 0
for im, annot in test_data_loader:
    pos = batch_i * 4 + 1
    for sample_i in range(len(im)) :
        
        img, rects = plot_image_from_output(im[sample_i], annot[sample_i])
        axes[(pos)//2, 1-((pos)%2)].imshow(img)
        for rect in rects:
            axes[(pos)//2, 1-((pos)%2)].add_patch(rect)
        
        img, rects = plot_image_from_output(im[sample_i], preds_adj_all[batch_i][sample_i])
        axes[(pos)//2, 1-((pos+1)%2)].imshow(img)
        for rect in rects:
            axes[(pos)//2, 1-((pos+1)%2)].add_patch(rect)

        pos += 2

    batch_i += 1
    if batch_i == 4:
        break

# remove the x/y ticks
for idx, ax in enumerate(axes.flat):
    ax.set_xticks([])
    ax.set_yticks([])

colnames = ['True', 'Pred']

for idx, ax in enumerate(axes[0]):
    ax.set_title(colnames[idx])

plt.tight_layout()
plt.show()
[Figure: eight test images with ground-truth boxes (True, left column) and predicted boxes (Pred, right column)]

Using the for loop above, we visualized the ground truth and predictions for 4 batches, 8 images in total. The left column shows the labels and locations of the ground-truth bounding boxes, and the right column shows the model's predictions. Mask wearers (green) are detected well, while people without masks (red) are occasionally detected as wearing a mask incorrectly (orange). To evaluate overall performance, we will compute the mean Average Precision (mAP), the standard metric for evaluating object detection models.
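For reference, the AP of a single class is the area under its precision-recall curve, obtained by sorting detections by confidence and accumulating true positives; mAP is the average of the per-class APs. A toy sketch of that computation (illustrative only; the ap_per_class function used below implements the full version):

import numpy as np

def average_precision(tp, scores, n_gt):
    # tp[i] = 1 if detection i matched a ground-truth box (IoU above threshold), else 0
    # n_gt = number of ground-truth boxes for this class
    order = np.argsort(-np.asarray(scores, dtype=float))  # sort detections by confidence
    tp = np.asarray(tp, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    precision = cum_tp / (np.arange(len(tp)) + 1)         # precision after each detection
    recall = cum_tp / n_gt                                # recall after each detection
    # prepend the (recall=0, precision=1) start point and integrate under the PR curve
    precision = np.concatenate(([1.0], precision))
    recall = np.concatenate(([0.0], recall))
    return np.trapz(precision, recall)

print(average_precision([1, 0, 1], [0.9, 0.8, 0.7], n_gt=2))  # ~0.79 in this toy case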

The Tutorial-Book-Utils folder cloned when we downloaded the data contains the file utils_ObjectDetection.py. We will compute mAP using the functions in that module. First, we import utils_ObjectDetection.py.

%cd Tutorial-Book-Utils/
import utils_ObjectDetection as utils
/content/Tutorial-Book-Utils
sample_metrics = []
for batch_i in range(len(preds_adj_all)):
    sample_metrics += utils.get_batch_statistics(preds_adj_all[batch_i], annot_all[batch_i], iou_threshold=0.5) 

After collecting the per-batch statistics needed for mAP into sample_metrics, we compute the mAP with the ap_per_class function.

true_positives, pred_scores, pred_labels = [torch.cat(x, 0) for x in list(zip(*sample_metrics))]  # concatenate the statistics across all batches
precision, recall, AP, f1, ap_class = utils.ap_per_class(true_positives, pred_scores, pred_labels, torch.tensor(labels))
mAP = torch.mean(AP)
print(f'mAP : {mAP}')
print(f'AP : {AP}')
mAP : 0.5824690281035101
AP : tensor([0.7684, 0.9188, 0.0603], dtype=torch.float64)

Interpreting the results: the model achieves an AP of 0.7684 for class 0 (no mask), 0.9188 for class 1 (mask worn), and 0.06 for class 2 (mask worn incorrectly). The very low AP for class 2 likely reflects how few mask_weared_incorrect examples the dataset contains.

In this chapter we built a medical mask detection model by applying transfer learning to RetinaNet. In the next chapter, we will use Faster R-CNN, a two-stage detector, to push detection performance further.