Baidu Netdisk AI Competition - Intelligent Watermark Removal: 19th-Place Solution

P粉084495128

Published: 2025-07-30 10:42:40

Source: php中文网

This project targets the intelligent watermark-removal track of the Baidu Netdisk AI Competition. It improves on a UNet baseline: leaky_relu is used to preserve information, a dual-branch attention pathway increases model capacity, and a residual connection speeds up convergence. The data are processed and split into training and validation sets, the model is trained with PSNR and SSIM losses, the A-leaderboard score improves, and finally the submission files and prediction results are generated.



Baidu Netdisk AI Competition - Image Processing Challenge: Intelligent Watermark Removal

This project completes the Baidu Netdisk AI Competition - Image Processing Challenge: Intelligent Watermark Removal track by removing watermarks from images with a slightly modified UNet built on the official Baseline.

Competition link

1. Competition Introduction

Watermarked images are everywhere in daily life, and even a Photoshop expert can hardly remove a watermark quickly and without a trace. An intelligent watermark-removal algorithm, by contrast, can strip watermarks automatically and fast. Contestants train deep learning models to process watermarked images captured in real scenes and output the cleaned result images.

The competition encourages contestants to combine state-of-the-art image processing and computer vision techniques to improve training performance and generalization, while also minding practical deployment: models should be as small and fast as possible without sacrificing accuracy.

Evaluation criteria

  • The evaluation metrics are PSNR and MSSSIM;
  • the evaluation machines provide only two inference runtimes, paddlepaddle and onnxruntime; models from other frameworks must be converted to one of these;
  • machine configuration: V100, 15 GB GPU memory, 10 GB RAM;
  • if a single image takes longer than 1.2 s, the performance score in the final is recorded as 0.

Therefore, an overly large model should be avoided.
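With a hard 1.2 s budget per image, it is worth measuring per-image latency early. A minimal sketch (the `run_model` callable, function name, and warm-up count are illustrative assumptions, not part of the competition harness):

```python
import time

def time_per_image(run_model, images, warmup=2):
    """Average per-image inference time in seconds (hypothetical helper)."""
    for img in images[:warmup]:
        run_model(img)  # warm-up passes, excluded from timing
    start = time.time()
    for img in images:
        run_model(img)
    return (time.time() - start) / len(images)
```

Anything approaching 1.2 s per image on a V100 is a signal to shrink the model or the input resolution.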

2. Method

Building on the Baseline, this project keeps a UNet that performs a pixel-level transformation of the watermarked image, with four modifications. First, we use leaky_relu instead of relu to preserve as much pixel information as possible. Second, in the transition layer between Encoder and Decoder we replace the original single branch with two parallel branches fused by adaptive attention weights before being passed to the Decoder, which increases model capacity. Third, the output head is likewise a dual branch fused by attention weights, improving adaptivity across samples. Fourth, we add a residual connection between the network's input and output, so the network concentrates on learning the difference between the clean and watermarked images, which speeds up convergence. Trained on a single data archive, these changes raised our A-leaderboard score from 0.58307 to 0.61742. We then trained with five times as much data; as with the small set, the model converged quickly, but the score only rose to 0.61805. The limited model capacity, together with the information lost by resizing to 512x512 for training and prediction, is the likely cause, but there was no time left for further tuning.

3. Data Processing

Extracting the data

Four data archives are used here, though one would actually be enough; alternatively, watermarked data can be generated automatically following the competition study materials.
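For reference, generating a watermarked sample from a clean background is essentially alpha blending. A minimal NumPy sketch, assuming uint8 RGB arrays for both the background and the watermark patch (the function name and blending rule are illustrative, not the competition's official generator):

```python
import numpy as np

def add_watermark(bg, mark, alpha=0.5, top=0, left=0):
    """Blend a watermark patch onto a background image; both are HxWx3 uint8 arrays."""
    out = bg.astype('float32').copy()
    h, w = mark.shape[:2]
    region = out[top:top+h, left:left+w]
    # Convex combination of background and watermark in the covered region
    out[top:top+h, left:left+w] = (1 - alpha) * region + alpha * mark.astype('float32')
    return out.clip(0, 255).astype('uint8')
```

The clean `bg` then serves as the label and the blended output as the input, mirroring the image/mask directory layout below.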

In [1]
%cd data
!mkdir train
%cd train/
!mkdir image
!mkdir mask
%cd ../../
       
/home/aistudio/data
/home/aistudio/data/train
/home/aistudio
       
In [2]
! tar -xf data/data142446/watermark_datasets.part1.tar
!rm data/data142446/watermark_datasets.part1.tar
!cp -r watermark_datasets.part1 -d data/train/image
!rm -r watermark_datasets.part1/
! tar -xf data/data142446/watermark_datasets.part2.tar
!rm data/data142446/watermark_datasets.part2.tar
!cp -r watermark_datasets.part2 -d data/train/image
!rm -r watermark_datasets.part2/
! tar -xf data/data142446/watermark_datasets.part3.tar
!rm data/data142446/watermark_datasets.part3.tar
!cp -r watermark_datasets.part3 -d data/train/image
!rm -r watermark_datasets.part3/
! tar -xf data/data142446/watermark_datasets.part10.tar
!rm data/data142446/watermark_datasets.part10.tar
!cp -r watermark_datasets.part10 -d data/train/image
!rm -r watermark_datasets.part10/
! tar -xf data/data142446/bg_images.tar
!rm data/data142446/bg_images.tar
!cp -r bg_images -d data/train/mask/
!rm -r bg_images/
   

Building the data reader

The reader is built on paddle.io.Dataset for convenient data loading.

Preprocessing consists of:

  1. Reshape both the watermarked and the clean images to (3, 512, 512)
  2. Normalize the images
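The two steps above can be sketched framework-free; this toy version uses nearest-neighbor index sampling in place of the bilinear resize the actual reader uses below (the helper name is illustrative):

```python
import numpy as np

def preprocess(img):
    """HxWx3 uint8 RGB image -> (3, 512, 512) float32 in [0, 1]."""
    h, w = img.shape[:2]
    rows = np.arange(512) * h // 512   # nearest-neighbor source rows
    cols = np.arange(512) * w // 512   # nearest-neighbor source columns
    resized = img[rows][:, cols]       # step 1: resize to 512x512
    return resized.transpose(2, 0, 1).astype('float32') / 255.  # step 2: HWC -> CHW, normalize
```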
In [1]
# Split into training and validation sets
import os

watermark_dir = "data/train/image"
bg_dir = "data/train/mask/bg_images"
watermark_sub_dir = list(os.listdir(watermark_dir))

all_watermark_list = []
for sub_dir in watermark_sub_dir:
    images_path = list(os.listdir(os.path.join(watermark_dir, sub_dir)))
    for path in images_path:
        if 'jpg' in path:
            all_watermark_list.append(os.path.join(sub_dir, path))

all_gt_list = list(os.listdir(bg_dir))
train_ratio = 0.985
all_watermark_list = sorted(all_watermark_list)

train_len = int(train_ratio*len(all_watermark_list))
train_data_list = all_watermark_list[:train_len]
val_data_list = all_watermark_list[train_len:]
print("total data num: {}, train num: {}, val_num: {}".format(len(all_watermark_list), len(train_data_list), len(val_data_list)))
       
total data num: 405020, train num: 398944, val_num: 6076
       
In [2]
import paddle
import os
import numpy as np
import pandas as pd
import cv2

class MyDateset(paddle.io.Dataset):
    def __init__(self, mode='train', train_list=None, watermark_dir=None, bg_dir=None, data_transform=None):
        super(MyDateset, self).__init__()

        self.mode = mode
        self.watermark_dir = watermark_dir
        self.bg_dir = bg_dir
        self.data_transform = data_transform

        self.train_list = train_list
        print(len(self.train_list))

    def __getitem__(self, index):
        item = self.train_list[index]

        # The clean background image shares the first 14 characters of the file name
        bg_item = item.split('/')[1][:14]+'.jpg'

        img = cv2.imread(os.path.join(self.watermark_dir, item))
        label = cv2.imread(os.path.join(self.bg_dir, bg_item))

        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        label = cv2.cvtColor(label, cv2.COLOR_BGR2RGB)

        img = paddle.vision.transforms.resize(img, (512,512), interpolation='bilinear')
        label = paddle.vision.transforms.resize(label, (512,512), interpolation='bilinear')

        img = img.transpose((2,0,1))
        label = label.transpose((2,0,1))

        img = img/255
        label = label/255

        img = paddle.to_tensor(img).astype('float32')
        label = paddle.to_tensor(label).astype('float32')
        return img, label

    def __len__(self):
        return len(self.train_list)
   

4. Defining the Network

The Baseline UNet is modified in the four ways described in Section 2: leaky_relu instead of relu to preserve pixel information; an attention-weighted dual-branch transition between Encoder and Decoder to increase model capacity; an attention-weighted dual-branch output head for better sample adaptivity; and a residual connection between the network's input and output so that the network focuses on the difference between the clean and watermarked images, speeding up convergence. With a single archive of training data, this raised the A-leaderboard score from 0.58307 to 0.61742.

In [3]
import paddle
from paddle import nn
import paddle.nn.functional as F

class CALayer(nn.Layer):
    # Channel attention: global average pooling followed by a bottleneck MLP
    def __init__(self, channels, reduction=16):
        super(CALayer, self).__init__()

        mid_c = max(channels//reduction, 16)
        self.conv1 = nn.Sequential(
            nn.Conv2D(channels, mid_c, 1),
            nn.ReLU(),
            nn.Conv2D(mid_c, channels, 1),
            nn.Sigmoid(),
            )

    def forward(self, x):
        y = x.mean(axis=(-1, -2), keepdim=True)
        y = self.conv1(y)
        return y

class Encoder(nn.Layer):
    # Downsampling block: two convolutions with batch norm, then pooling
    def __init__(self, num_channels, num_filters):
        super(Encoder, self).__init__()
        self.conv1 = nn.Conv2D(in_channels=num_channels,
                              out_channels=num_filters,
                              kernel_size=3,  # 3x3 kernel, stride 1, padding 1: spatial size [H, W] unchanged
                              stride=1,
                              padding=1,
                              bias_attr=False)
        self.bn1   = nn.BatchNorm(num_filters)

        self.conv2 = nn.Conv2D(in_channels=num_filters,
                              out_channels=num_filters,
                              kernel_size=3,
                              stride=1,
                              padding=1,
                              bias_attr=False)
        self.bn2   = nn.BatchNorm(num_filters)

        self.pool  = nn.MaxPool2D(kernel_size=2, stride=2, padding="SAME")  # pooling halves the spatial size [H/2, W/2]

        if num_channels != num_filters:
            self.downsample = nn.Sequential(
                nn.Conv2D(num_channels, num_filters, 1, bias_attr=False),
                nn.BatchNorm2D(num_filters)
            )
        else:
            self.downsample = lambda x: x

    def forward(self, inputs):
        x = self.conv1(inputs)
        x = self.bn1(x)
        x = F.leaky_relu(x, 0.2)
        x = self.conv2(x)
        x = self.bn2(x)
        x = F.leaky_relu(x+self.downsample(inputs), 0.2)  # residual shortcut within the block
        x_conv = x            # skip-connection output, passed to the matching Decoder
        x_pool = self.pool(x) # pooled output, passed down the encoder path
        return x_conv, x_pool
    
class Decoder(nn.Layer):
    # Upsampling block: one transposed convolution, then two convolutions with batch norm
    def __init__(self, num_channels, num_filters):
        super(Decoder, self).__init__()
        self.up = nn.Conv2DTranspose(in_channels=num_channels,
                                    out_channels=num_filters,
                                    kernel_size=2,
                                    stride=2,
                                    padding=0,
                                    bias_attr=False)  # doubles the spatial size [2*H, 2*W]
        self.up_bn   = nn.BatchNorm(num_filters)

        self.conv1 = nn.Conv2D(in_channels=num_filters*2,
                              out_channels=num_filters,
                              kernel_size=3,
                              stride=1,
                              padding=1,
                              bias_attr=False)
        self.bn1   = nn.BatchNorm(num_filters)

        self.conv2 = nn.Conv2D(in_channels=num_filters,
                              out_channels=num_filters,
                              kernel_size=3,
                              stride=1,
                              padding=1,
                              bias_attr=False)
        self.bn2   = nn.BatchNorm(num_filters)

        if num_channels != num_filters:
            self.upsample = nn.Sequential(
                nn.Conv2D(num_filters*2, num_filters, 1, bias_attr=False),
                nn.BatchNorm2D(num_filters)
            )
        else:
            self.upsample = lambda x: x  # fixed: the original assigned self.downsample here, but forward uses self.upsample

    def forward(self, input_conv, input_pool):
        x = self.up_bn(self.up(input_pool))
        x = F.leaky_relu(x, 0.2)
        h_diff = (input_conv.shape[2]-x.shape[2])
        w_diff = (input_conv.shape[3]-x.shape[3])
        pad = nn.Pad2D(padding=[h_diff//2, h_diff-h_diff//2, w_diff//2, w_diff-w_diff//2])
        x = pad(x)                                    # pad the upsampled feature map to match the saved encoder feature map
        x = paddle.concat(x=[input_conv, x], axis=1)  # concatenate for context: in_channels doubles
        x_sc = self.upsample(x)
        x = self.conv1(x)
        x = self.bn1(x)
        x = F.leaky_relu(x, 0.2)
        x = self.conv2(x)
        x = self.bn2(x)
        x = F.leaky_relu(x+x_sc, 0.2)
        return x
class UNet(nn.Layer):
    def __init__(self, num_classes=3):
        super(UNet, self).__init__()
        self.down1 = Encoder(num_channels=  3, num_filters=64)  # encoder path
        self.down2 = Encoder(num_channels= 64, num_filters=128)
        self.down3 = Encoder(num_channels=128, num_filters=256)
        self.down4 = Encoder(num_channels=256, num_filters=512)

        self.mid_conv1 = nn.Sequential(
            nn.Conv2D(512, 1024, 1, bias_attr=False),
            nn.BatchNorm(1024),
            nn.LeakyReLU(0.2)
        )

        self.mid_conv2 = nn.Sequential(
            nn.Conv2D(512, 1024, 3, padding=1, bias_attr=False),
            nn.BatchNorm(1024),
            nn.LeakyReLU(0.2)
        )

        self.ca_layer1 = CALayer(1024, 32)

        self.mid_conv3 = nn.Sequential(
            nn.Conv2D(1024, 1024, 1, bias_attr=False),
            nn.BatchNorm(1024),
            nn.LeakyReLU(0.2)
        )

        self.up4 = Decoder(1024, 512)                           # decoder path
        self.up3 = Decoder(512, 256)
        self.up2 = Decoder(256, 128)
        self.up1 = Decoder(128, 64)

        self.last_conv1 = nn.Conv2D(64, num_classes, 1)          # dual-branch output head
        self.last_conv2 = nn.Conv2D(64, num_classes, 3, padding=1)

        self.ca_layer2 = CALayer(num_classes)

    def forward(self, inputs):
        x1, x = self.down1(inputs)
        x2, x = self.down2(x)
        x3, x = self.down3(x)
        x4, x = self.down4(x)

        # Dual-branch transition, fused with adaptive attention weights
        x_m1 = self.mid_conv1(x)
        x_m2 = self.mid_conv2(x)
        attn = self.ca_layer1(x_m1+x_m2)
        x = x_m1*attn + x_m2*(1.-attn)
        x = self.mid_conv3(x)

        x = self.up4(x4, x)
        x = self.up3(x3, x)
        x = self.up2(x2, x)
        x = self.up1(x1, x)

        # Dual-branch output, fused with attention weights
        out1 = self.last_conv1(x)
        out2 = self.last_conv2(x)
        attn = self.ca_layer2(out1+out2)
        x = out1*attn + out2*(1.-attn)

        # Global residual: the network predicts the watermark, which is subtracted from the input
        return inputs - x

# Inspect the per-layer output shapes
#paddle.summary(UNet(), (1, 3, 600, 600))
   
In [ ]
net = UNet()
train_dataset = MyDateset(train_list=train_data_list, watermark_dir=watermark_dir, bg_dir=bg_dir)
train_loader = paddle.io.DataLoader(
    train_dataset,
    batch_size=1,
    shuffle=True,
    drop_last=False)

# Sanity check: run one batch through the network
for data in train_loader:
    img, label = data
    pred = net(img)
    break
   

5. Defining the Loss

In the same borrow-what-works spirit, the MSSSIM code below is copied from the article on the image quality metrics PSNR, SSIM, and MS-SSIM.


~Whether you can follow the code doesn't matter much; what matters is the takeaway: someone has already written a ready-made loss function you can call directly~

MSSSIM alone is not enough, of course; following the Sub-Pixel image super-resolution tutorial, a PSNR loss function can be written as well.
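The idea behind the PSNR loss, sketched in NumPy before the Paddle version below: with images scaled to [0, 1], PSNR = 20·log10(1/RMSE), and the loss is 100 minus PSNR, so minimizing the loss maximizes PSNR (the standalone function here is illustrative):

```python
import numpy as np

def psnr_loss(pred, label):
    """loss = 100 - PSNR for images in [0, 1] (mirrors the PSNRLoss layer)."""
    rmse = np.sqrt(((pred - label) ** 2).mean())
    return 100 - 20 * np.log10(1.0 / rmse)
```

A uniform error of 0.1 gives PSNR = 20 dB, hence a loss of 80.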

In [5]
import paddle
import paddle.nn.functional as F

def gaussian1d(window_size, sigma):
    ### window_size = 11
    x = paddle.arange(window_size, dtype='float32')
    x = x - window_size//2
    gauss = paddle.exp(-x ** 2 / float(2 * sigma ** 2))
    # gauss has shape [window_size], e.g. [11]
    return gauss / gauss.sum()

def create_window(window_size, sigma, channel):
    _1D_window = gaussian1d(window_size, sigma).unsqueeze(1)
    _2D_window = _1D_window.mm(_1D_window.t()).unsqueeze(0).unsqueeze(0)
    # print('2d', _2D_window.shape)
    return _2D_window.expand([channel, 1, window_size, window_size])

def _ssim(img1, img2, window, window_size, channel=3, data_range=255., size_average=True, C=None):
    # size_average for different channel

    padding = window_size // 2

    mu1 = F.conv2d(img1, window, padding=padding, groups=channel)
    mu2 = F.conv2d(img2, window, padding=padding, groups=channel)    # print(mu1.shape)
    # print(mu1[0,0])
    # print(mu1.mean())
    mu1_sq = mu1.pow(2)
    mu2_sq = mu2.pow(2)
    mu1_mu2 = mu1 * mu2
    sigma1_sq = F.conv2d(img1 * img1, window, padding=padding, groups=channel) - mu1_sq
    sigma2_sq = F.conv2d(img2 * img2, window, padding=padding, groups=channel) - mu2_sq
    sigma12 = F.conv2d(img1 * img2, window, padding=padding, groups=channel) - mu1_mu2
    if C == None:
        C1 = (0.01*data_range) ** 2
        C2 = (0.03*data_range) ** 2
    else:
        C1 = (C[0]*data_range) ** 2
        C2 = (C[1]*data_range) ** 2
    # l = (2 * mu1_mu2 + C1) / (mu1_sq + mu2_sq + C1)
    # ssim_map = ((2 * mu1_mu2 + C1) * (2 * sigma12 + C2)) / ((mu1_sq + mu2_sq + C1) * (sigma1_sq + sigma2_sq + C2))
    sc = (2 * sigma12 + C2) / (sigma1_sq + sigma2_sq + C2)
    lsc = ((2 * mu1_mu2 + C1) / (mu1_sq + mu2_sq + C1))*sc
    if size_average:
        # .mean() averages over every element of the tensor
        return lsc.mean()
    else:
        # return per-channel values
        return lsc.flatten(2).mean(-1), sc.flatten(2).mean(-1)

def ms_ssim(
    img1, img2, window, data_range=255, size_average=True, window_size=11, channel=3, sigma=1.5, weights=None, C=(0.01, 0.03)):

    r""" interface of ms-ssim
    Args:
        img1 (torch.Tensor): a batch of images, (N,C,[T,]H,W)
        img2 (torch.Tensor): a batch of images, (N,C,[T,]H,W)
        data_range (float or int, optional): value range of input images. (usually 1.0 or 255)
        size_average (bool, optional): if size_average=True, ssim of all images will be averaged as a scalar
        win_size: (int, optional): the size of gauss kernel
        win_sigma: (float, optional): sigma of normal distribution
        win (torch.Tensor, optional): 1-D gauss kernel. if None, a new kernel will be created according to win_size and win_sigma
        weights (list, optional): weights for different levels
        K (list or tuple, optional): scalar constants (K1, K2). Try a larger K2 constant (e.g. 0.4) if you get a negative or NaN results.
    Returns:
        torch.Tensor: ms-ssim results
    """
    if not img1.shape == img2.shape:
        raise ValueError("Input images should have the same dimensions.")
    # for d in range(len(img1.shape) - 1, 1, -1):
    #     img1 = img1.squeeze(dim=d)
    #     img2 = img2.squeeze(dim=d)

    if not img1.dtype == img2.dtype:
        raise ValueError("Input images should have the same dtype.")

    if len(img1.shape) == 4:
        avg_pool = F.avg_pool2d
    elif len(img1.shape) == 5:
        avg_pool = F.avg_pool3d
    else:
        raise ValueError(f"Input images should be 4-d or 5-d tensors, but got {img1.shape}")

    smaller_side = min(img1.shape[-2:])
    assert smaller_side > (window_size - 1) * (2 ** 4), \
        "Image size should be larger than %d due to the 4 downsamplings " \
        "with window_size %d in ms-ssim" % ((window_size - 1) * (2 ** 4), window_size)

    if weights is None:
        weights = [0.0448, 0.2856, 0.3001, 0.2363, 0.1333]
    weights = paddle.to_tensor(weights)
    if window is None:
        window = create_window(window_size, sigma, channel)
    assert window.shape == [channel, 1, window_size, window_size], "window.shape error"

    levels = weights.shape[0]  # 5
    mcs = []
    for i in range(levels):
        ssim_per_channel, cs = _ssim(img1, img2, window=window, window_size=window_size,
                                     channel=3, data_range=data_range, C=C, size_average=False)
        if i < levels - 1:
            mcs.append(F.relu(cs))
            padding = [s % 2 for s in img1.shape[2:]]
            img1 = avg_pool(img1, kernel_size=2, padding=padding)
            img2 = avg_pool(img2, kernel_size=2, padding=padding)

    ssim_per_channel = F.relu(ssim_per_channel)  # (batch, channel)
    mcs_and_ssim = paddle.stack(mcs + [ssim_per_channel], axis=0)  # (level, batch, channel), stacked by level
    ms_ssim_val = paddle.prod(mcs_and_ssim ** weights.reshape([-1, 1, 1]), axis=0)  # product across levels
    print(ms_ssim_val.shape)
    if size_average:
        return ms_ssim_val.mean()
    else:
        # return per-channel values
        return ms_ssim_val.flatten(2).mean(1)

class SSIMLoss(paddle.nn.Layer):
    """
    1. Subclass paddle.nn.Layer
    """
    def __init__(self, window_size=11, channel=3, data_range=255., sigma=1.5):
        """
        2. Define whatever parameters the loss needs in the constructor
        """
        super(SSIMLoss, self).__init__()
        self.data_range = data_range
        self.C = [0.01, 0.03]
        self.window_size = window_size
        self.channel = channel
        self.sigma = sigma
        self.window = create_window(self.window_size, self.sigma, self.channel)

    def forward(self, input, label):
        """
        3. forward receives two arguments:
            - input: the model's forward output for a sample or batch
            - label: the corresponding ground-truth data
            It returns a Tensor: the summed or averaged loss
        """
        return 1 - _ssim(input, label, data_range=self.data_range,
                         window=self.window, window_size=self.window_size, channel=3,
                         size_average=True, C=self.C)

class MS_SSIMLoss(paddle.nn.Layer):
    """
    1. Subclass paddle.nn.Layer
    """
    def __init__(self, data_range=255., channel=3, window_size=11, sigma=1.5):
        """
        2. Define whatever parameters the loss needs in the constructor
        """
        super(MS_SSIMLoss, self).__init__()
        self.data_range = data_range
        self.C = [0.01, 0.03]
        self.window_size = window_size
        self.channel = channel
        self.sigma = sigma
        self.window = create_window(self.window_size, self.sigma, self.channel)

    def forward(self, input, label):
        """
        3. forward receives two arguments:
            - input: the model's forward output for a sample or batch
            - label: the corresponding ground-truth data
            It returns a Tensor: the summed or averaged loss
        """
        return 1 - ms_ssim(input, label, data_range=self.data_range,
                           window=self.window, window_size=self.window_size, channel=self.channel,
                           size_average=True, sigma=self.sigma,
                           weights=None, C=self.C)

class PSNRLoss(paddle.nn.Layer):
    def __init__(self):
        super(PSNRLoss, self).__init__()

    def forward(self, input, label):
        return 100 - 20 * paddle.log10(((input - label)**2).mean(axis=[1, 2, 3])**-0.5)
   

6. Training

In [6]
import random

def seed_paddle(seed=1024):
    seed = int(seed)
    random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
    np.random.seed(seed)
    paddle.seed(seed)
   
In [7]
class AverageMeter(object):
    """Computes and stores the average and current value"""
    def __init__(self):
        self.reset()

    def reset(self):
        self.val = 0
        self.avg = 0
        self.sum = 0
        self.count = 0

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count

def evaluate(val_loader, model, criterion, print_interval=100):
    losses = AverageMeter()
    psnr = AverageMeter()
    ssim = AverageMeter()
    batch_time = AverageMeter()
    lossfn, losspsnr = criterion
    for step, data in enumerate(val_loader):

        img, label = data
        end = time.time()
        pre = model(img)
        batch_time.update(time.time() - end)
        loss1 = lossfn(pre, label).mean()
        loss2 = losspsnr(pre, label).mean()
        loss = (loss1+loss2/100)/2

        losses.update(loss.item(), img.shape[0])
        psnr.update(100.-loss2.item(), img.shape[0])
        ssim.update(1.-loss1.item(), img.shape[0])
        if step%print_interval==0:
            print('Test: [{0}/{1}]\t'
                'Time {batch_time.val:.3f} ({batch_time.avg:.3f})\t'
                'Loss {loss.val:.4f} ({loss.avg:.4f})\t'
                'SSIM {ssim.val:.3f} ({ssim.avg:.3f})\t'
                'PSNR {psnr.val:.3f} ({psnr.avg:.3f})'.format(
                step,
                len(val_loader),
                batch_time=batch_time,
                loss=losses,
                ssim=ssim,
                psnr=psnr))
    print(' * SSIM {ssim.avg:.3f} PSNR {psnr.avg:.3f} Time {batch_time.avg:.3f}'
            .format(ssim=ssim, psnr=psnr, batch_time=batch_time))
    return losses.avg, ssim.avg, psnr.avg
   
In [8]
batch_size = 8
max_epoch = 10
init_lr = 0.005
print_interval = 100
val_interval = 1
save_dir = "models/output"
save_interval = 1
save_interval_s = 8000
start_epoch = 0
start_step = 0
init_loss = 999
   
In [12]
import time 
from visualdl import LogWriter

model = UNet()
model.train()

train_dataset = MyDateset(train_list=train_data_list, watermark_dir=watermark_dir, bg_dir=bg_dir)
val_dataset = MyDateset(train_list=val_data_list, watermark_dir=watermark_dir, bg_dir=bg_dir)

# To resume training from an earlier checkpoint, set start_epoch/start_step accordingly
if start_epoch > 0:
    #param_dict = paddle.load(os.path.join(save_dir, 'model_step_{}.pdparams'.format(str(start_step))))
    param_dict = paddle.load(os.path.join(save_dir, 'model_{}.pdparams'.format(str(start_epoch-1))))
    model.set_state_dict(param_dict)

train_loader = paddle.io.DataLoader(
    train_dataset,
    batch_size=batch_size,
    shuffle=True,
    drop_last=False)

val_loader = paddle.io.DataLoader(
    val_dataset,
    batch_size=batch_size,
    shuffle=False,
    drop_last=False)

losspsnr = PSNRLoss()
lossfn = SSIMLoss(window_size=3,data_range=1)

scheduler = paddle.optimizer.lr.CosineAnnealingDecay(learning_rate=init_lr, T_max=max_epoch)
opt = paddle.optimizer.Adam(learning_rate=scheduler, parameters=model.parameters())

if start_epoch > 0:
    #param_dict = paddle.load(os.path.join(save_dir, 'opt_step_{}.pdopt'.format(str(start_step))))
    param_dict = paddle.load(os.path.join(save_dir, 'opt_{}.pdopt'.format(str(start_epoch-1))))
    opt.set_state_dict(param_dict)

writer = LogWriter(os.path.join(save_dir, "logs"))
       
398944
6076
       
In [ ]
now_step = start_epoch*len(train_loader)
min_loss = init_loss

for epoch in range(start_epoch, max_epoch):
    losses = AverageMeter()
    psnr = AverageMeter()
    ssim = AverageMeter()
    batch_time = AverageMeter()
    data_time = AverageMeter()
    end = time.time()
    for step, data in enumerate(train_loader):
        if epoch==start_epoch and step>=((start_epoch+1)*len(train_loader)-now_step):
            #print(step+start_epoch*len(train_loader))
            break

        img, label = data
        data_time.update(time.time() - end)
        pre = model(img)
        loss1 = lossfn(pre, label).mean()
        loss2 = losspsnr(pre, label).mean()
        loss = (loss1+loss2/100)/2

        loss.backward()
        opt.step()
        opt.clear_gradients()

        losses.update(loss.item(), img.shape[0])
        psnr.update(100.-loss2.item(), img.shape[0])
        ssim.update(1.-loss1.item(), img.shape[0])
        batch_time.update(time.time() - end)

        if now_step%print_interval==0:
            writer.add_scalar('train/loss', losses.val, now_step)
            writer.add_scalar('train/ssim', ssim.val, now_step)
            writer.add_scalar('train/psnr', psnr.val, now_step)
            print('Epoch: [{0}][{1}/{2}]\t'
                  'Time {batch_time.val:.3f} ({batch_time.avg:.3f})\t'
                  'Data {data_time.val:.3f} ({data_time.avg:.3f})\t'
                  'Loss {loss.val:.4f} ({loss.avg:.4f})\t'
                  'SSIM {ssim.val:.3f} ({ssim.avg:.3f})\t'
                  'PSNR {psnr.val:.3f} ({psnr.avg:.3f})'.format(
                    epoch,
                    step,
                    len(train_loader),
                    batch_time=batch_time,
                    data_time=data_time,
                    loss=losses,
                    ssim=ssim,
                    psnr=psnr))
        if now_step%save_interval_s==0:
            paddle.save(model.state_dict(), os.path.join(save_dir, 'model_step_{}.pdparams'.format(str(now_step))))
            paddle.save(opt.state_dict(), os.path.join(save_dir, 'opt_step_{}.pdopt'.format(str(now_step))))
        now_step += 1
        end = time.time()

    writer.add_scalar('train/lr', opt.get_lr(), epoch)
    scheduler.step()
    if epoch%val_interval==0:
        with paddle.no_grad():
            model.eval()
            val_loss, val_ssim, val_psnr = evaluate(val_loader, model, criterion=(lossfn, losspsnr), print_interval=print_interval)
            model.train()
            writer.add_scalar('val/loss', val_loss, epoch)
            writer.add_scalar('val/ssim', val_ssim, epoch)
            writer.add_scalar('val/psnr', val_psnr, epoch)
        # The source is truncated here; presumably the best checkpoint is kept:
        if val_loss < min_loss:
            min_loss = val_loss
            paddle.save(model.state_dict(), os.path.join(save_dir, 'model_best.pdparams'))
    

7. Prediction and Submission

For this task you submit the model together with a prediction script. predict.py must load the model from its own directory, predict the de-watermarked images, and save them.

To use your own trained model, replace the model in predict.py and the "do something" part of its process function with your own code.

The Baseline notes: the raw UNet output may not look ideal, and not every issue needs to be fixed by changing the network. For example, clamping colors within a threshold to pure black (the text color) or pure white (the background color) can make the result more pleasing to the eye. predict.py already includes this strategy via:

pre[pre>0.9]=1
pre[pre<0.1]=0
       

We removed this strategy, however: our A-leaderboard score rose from 0.61805 to 0.61919 without it.

In [1]
# Package the submission files
! zip submit_removal.zip model_best.pdparams predict.py
       
  adding: model_best.pdparams (deflated 7%)
  adding: predict.py (deflated 72%)
       

Inspecting the predictions (optional, very time-consuming)

Curious what your trained network's de-watermarked images actually look like? Download test set A and see for yourself.

Download the test set

In [ ]
! wget https://staticsns.cdn.bcebos.com/amis/2022-4/1649745356784/watermark_test_datasets.zip
! unzip -oq watermark_test_datasets.zip
! rm -rf watermark_test_datasets.zip
   

Predict on the test set

In [ ]
! python predict.py watermark_test_datasets/images results
   

When prediction finishes, open the results folder to see the de-watermarked images.

Take the image bg_image_00005_0002.jpg as an example:

[Image: side-by-side comparison of the example image with and without the watermark]