Sparse Autoencoder Self-Learning Code
Posted by madio on 2019-3-23 11:05
A sparse autoencoder can learn features automatically from unlabeled data, and the features it discovers often describe the data better than the raw inputs do. In practice, the learned features can be used in place of the raw data, which frequently gives better results. This post describes the sparse autoencoder algorithm and shows that, trained on natural image patches, it automatically extracts edge features.
The objective being minimized is written out first, followed by a MATLAB implementation of the sparse autoencoder cost function and its derivatives:
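In the notation of the lecture notes referenced in the code comments, the objective the function computes is

$$J_{\text{sparse}}(W,b) = \frac{1}{m}\sum_{i=1}^{m}\tfrac{1}{2}\big\lVert h_{W,b}(x^{(i)}) - x^{(i)}\big\rVert^2 + \frac{\lambda}{2}\left(\lVert W^{(1)}\rVert_F^2 + \lVert W^{(2)}\rVert_F^2\right) + \beta\sum_{j=1}^{\text{hiddenSize}}\mathrm{KL}\!\left(\rho\,\Vert\,\hat{\rho}_j\right),$$

where $\hat{\rho}_j = \frac{1}{m}\sum_{i=1}^{m} a_j^{(2)}(x^{(i)})$ is the average activation of hidden unit $j$ (the variable `rho` in the code), $\rho$ is `sparsityParam`, and $\mathrm{KL}(\rho\,\Vert\,\hat{\rho}_j) = \rho\log\frac{\rho}{\hat{\rho}_j} + (1-\rho)\log\frac{1-\rho}{1-\hat{\rho}_j}$.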

    function [cost,grad] = sparseAutoencoderCost(theta, visibleSize, hiddenSize, ...
                                                 lambda, sparsityParam, beta, data)

    % visibleSize: the number of input units (probably 64)
    % hiddenSize: the number of hidden units (probably 25)
    % lambda: weight decay parameter
    % sparsityParam: The desired average activation for the hidden units (denoted in the lecture
    %                           notes by the greek alphabet rho, which looks like a lower-case "p").
    % beta: weight of sparsity penalty term
    % data: Our 64x10000 matrix containing the training data.  So, data(:,i) is the i-th training example.

    % The input theta is a vector (because minFunc expects the parameters to be a vector).
    % We first convert theta to the (W1, W2, b1, b2) matrix/vector format, so that this
    % follows the notation convention of the lecture notes.

    W1 = reshape(theta(1:hiddenSize*visibleSize), hiddenSize, visibleSize);
    W2 = reshape(theta(hiddenSize*visibleSize+1:2*hiddenSize*visibleSize), visibleSize, hiddenSize);
    b1 = theta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);
    b2 = theta(2*hiddenSize*visibleSize+hiddenSize+1:end);

    % Cost and gradient variables (your code needs to compute these values).
    % Here, we initialize them to zeros.
    cost = 0;
    W1grad = zeros(size(W1));
    W2grad = zeros(size(W2));
    b1grad = zeros(size(b1));
    b2grad = zeros(size(b2));

    %% ---------- YOUR CODE HERE --------------------------------------
    %  Instructions: Compute the cost/optimization objective J_sparse(W,b) for the Sparse Autoencoder,
    %                and the corresponding gradients W1grad, W2grad, b1grad, b2grad.
    %
    % W1grad, W2grad, b1grad and b2grad should be computed using backpropagation.
    % Note that W1grad has the same dimensions as W1, b1grad has the same dimensions
    % as b1, etc.  Your code should set W1grad to be the partial derivative of J_sparse(W,b) with
    % respect to W1.  I.e., W1grad(i,j) should be the partial derivative of J_sparse(W,b)
    % with respect to the input parameter W1(i,j).  Thus, W1grad should be equal to the term
    % [(1/m) \Delta W^{(1)} + \lambda W^{(1)}] in the last block of pseudo-code in Section 2.2
    % of the lecture notes (and similarly for W2grad, b1grad, b2grad).
    %
    % Stated differently, if we were using batch gradient descent to optimize the parameters,
    % the gradient descent update to W1 would be W1 := W1 - alpha * W1grad, and similarly for W2, b1, b2.
    %
    m = size(data, 2);                 % number of training examples

    % Forward pass
    x = data;
    a1 = x;                            % input layer activations (visibleSize x m)
    z2 = W1*a1 + repmat(b1, 1, m);
    a2 = sigmoid(z2);                  % hidden layer activations (hiddenSize x m)
    z3 = W2*a2 + repmat(b2, 1, m);
    a3 = sigmoid(z3);                  % output layer activations (reconstruction of x)
    h = a3;
    y = x;                             % autoencoder target is the input itself

    % Cost = average reconstruction error + weight decay + sparsity penalty (KL divergence)
    squared_error = 0.5*sum((h - y).^2, 1);
    rho = (1/m)*sum(a2, 2);            % rho-hat: average activation of each hidden unit
    sparsity_penalty = beta*sum(sparsityParam.*log(sparsityParam./rho) + (1 - sparsityParam).*log((1 - sparsityParam)./(1 - rho)));
    cost = (1/m)*sum(squared_error) + (lambda/2)*(sum(sum(W1.^2)) + sum(sum(W2.^2))) + sparsity_penalty;

    % Backpropagation; delta_2 carries the extra sparsity-penalty term
    grad_z3 = a3.*(1 - a3);            % sigmoid'(z3)
    delta_3 = -(y - a3).*grad_z3;
    grad_z2 = a2.*(1 - a2);            % sigmoid'(z2)
    delta_2 = (W2'*delta_3 + repmat(beta*(-sparsityParam./rho + (1 - sparsityParam)./(1 - rho)), 1, m)).*grad_z2;
    Delta_W2 = delta_3*a2';
    Delta_b2 = sum(delta_3, 2);
    Delta_W1 = delta_2*a1';
    Delta_b1 = sum(delta_2, 2);
    W1grad = (1/m)*Delta_W1 + lambda*W1;
    W2grad = (1/m)*Delta_W2 + lambda*W2;
    b1grad = (1/m)*Delta_b1;
    b2grad = (1/m)*Delta_b2;


    %-------------------------------------------------------------------
    % After computing the cost and gradient, we will convert the gradients back
    % to a vector format (suitable for minFunc).  Specifically, we will unroll
    % your gradient matrices into a vector.

    grad = [W1grad(:) ; W2grad(:) ; b1grad(:) ; b2grad(:)];

    end

    %-------------------------------------------------------------------
    % Here's an implementation of the sigmoid function, which you may find useful
    % in your computation of the costs and the gradients.  This inputs a (row or
    % column) vector (say (z1, z2, z3)) and returns (f(z1), f(z2), f(z3)).

    function sigm = sigmoid(x)

        sigm = 1 ./ (1 + exp(-x));
    end
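
As a minimal usage sketch, assuming a matrix `data` of image patches (visibleSize rows, one column per patch) is already loaded, and using hyperparameter values in the spirit of the UFLDL edge-detection exercise, the function can be wired up and gradient-checked like this:

    % Hypothetical driver script: `data` is assumed to be visibleSize x m.
    visibleSize = 64;
    hiddenSize = 25;
    lambda = 1e-4;
    sparsityParam = 0.01;
    beta = 3;

    % Randomly initialize (W1, W2, b1, b2) and unroll them into a single
    % vector in the same order that sparseAutoencoderCost expects.
    r = sqrt(6) / sqrt(hiddenSize + visibleSize + 1);
    W1 = rand(hiddenSize, visibleSize)*2*r - r;
    W2 = rand(visibleSize, hiddenSize)*2*r - r;
    b1 = zeros(hiddenSize, 1);
    b2 = zeros(visibleSize, 1);
    theta = [W1(:) ; W2(:) ; b1(:) ; b2(:)];

    % Spot-check the analytic gradient against finite differences on a few
    % parameters (checking every entry is slow).
    sample = data(:, 1:min(100, size(data, 2)));
    [cost, grad] = sparseAutoencoderCost(theta, visibleSize, hiddenSize, ...
                                         lambda, sparsityParam, beta, sample);
    epsilon = 1e-4;
    for i = 1:5
        e = zeros(size(theta));
        e(i) = epsilon;
        cplus  = sparseAutoencoderCost(theta + e, visibleSize, hiddenSize, lambda, sparsityParam, beta, sample);
        cminus = sparseAutoencoderCost(theta - e, visibleSize, hiddenSize, lambda, sparsityParam, beta, sample);
        fprintf('param %d: analytic %.6e, numeric %.6e\n', i, grad(i), (cplus - cminus)/(2*epsilon));
    end

    % Once the gradient checks out, theta can be optimized, e.g. with minFunc
    % as in the UFLDL exercise:
    %   options.Method = 'lbfgs'; options.maxIter = 400;
    %   opttheta = minFunc(@(p) sparseAutoencoderCost(p, visibleSize, hiddenSize, ...
    %                      lambda, sparsityParam, beta, data), theta, options);

Visualizing the rows of the learned W1 as image patches is what reveals the edge-like features mentioned above.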