Question 2

Question 2.5
We need to find the update rule for the weights $w_{jk}$ between the $k$-th input neuron and the $j$-th hidden neuron. Therefore we need to differentiate the error with respect to those weights:
\[
\frac{\partial E}{\partial w_{jk}} = \frac{\partial E}{\partial y_j}\,\frac{\partial y_j}{\partial net_j}\,\frac{\partial net_j}{\partial w_{jk}} \qquad \text{(Chain Rule)}
\]
where
\[
net_j = \sum_{k=1}^{I} w_{jk}\,y_k \qquad \text{and} \qquad y_j = \frac{\sqrt{x^2+1}-1}{2} + x,
\]
with $x$ denoting the neuron's net input $net_j$.
We will start with $\frac{\partial net_j}{\partial w_{jk}}$:
\begin{align*}
\frac{\partial net_j}{\partial w_{jk}} &= \frac{\partial}{\partial w_{jk}}\left(\sum_{k=1}^{I} w_{jk}\,y_k\right)\\
&= \frac{\partial}{\partial w_{jk}}\left(w_{j1}y_1 + w_{j2}y_2 + \ldots + w_{jk}y_k + \ldots + w_{jI}y_I\right) && \text{(Expanding } net_j\text{)}\\
&= \frac{\partial}{\partial w_{jk}}\left(w_{j1}y_1\right) + \frac{\partial}{\partial w_{jk}}\left(w_{j2}y_2\right) + \ldots + \frac{\partial}{\partial w_{jk}}\left(w_{jk}y_k\right) + \ldots + \frac{\partial}{\partial w_{jk}}\left(w_{jI}y_I\right) && \text{(Sum Rule)}\\
&= \frac{\partial}{\partial w_{jk}}\left(w_{jk}y_k\right)\\
&= y_k
\end{align*}
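Even this simple result can be spot-checked numerically. The sketch below is only an illustration (the weight and input values, the helper name \texttt{net\_j}, and the chosen index are made-up assumptions, not part of the assignment); it compares the derived $\frac{\partial net_j}{\partial w_{jk}} = y_k$ with a central finite difference of $net_j$.
\begin{verbatim}
w_j = [0.3, 0.1, 0.4]        # weights w_jk for one hidden neuron (made-up values)
y   = [0.5, 0.2, 0.7]        # input activations y_k (made-up values)

def net_j(w):
    # net_j = sum_k w_jk * y_k
    return sum(wk * yk for wk, yk in zip(w, y))

k, eps = 1, 1e-6
w_plus  = list(w_j); w_plus[k]  += eps
w_minus = list(w_j); w_minus[k] -= eps
numeric = (net_j(w_plus) - net_j(w_minus)) / (2 * eps)
print(y[k], numeric)         # both values should agree: the derivative is y_k
\end{verbatim}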
Now we derive the second term $\frac{\partial y_j}{\partial net_j}$:
\begin{align*}
\frac{\partial y_j}{\partial net_j} &= \frac{\partial}{\partial net_j}\left(\frac{1}{2}\left(\sqrt{net_j^2+1}-1\right)+net_j\right)\\
&= \frac{\partial}{\partial net_j}\left(\frac{1}{2}\left(\sqrt{net_j^2+1}-1\right)\right)+\frac{\partial}{\partial net_j}\left(net_j\right) && \text{(Sum Rule)}
\end{align*}
The second term is simply
\[ RHS = 1 \]
For the first term,
\begin{align*}
LHS &= \frac{1}{2}\left(\frac{\partial}{\partial net_j}\left(\sqrt{net_j^2+1}\right)-\frac{\partial}{\partial net_j}\left(1\right)\right) && \text{(Sum Rule, constant taken out)}\\
&= \frac{1}{2}\,\frac{\partial}{\partial net_j}\left(\sqrt{net_j^2+1}\right) && \text{(the derivative of a constant is } 0\text{)}
\end{align*}
Let $u = net_j^2+1$. Using the chain rule we get:
\begin{align*}
LHS &= \frac{1}{2}\left(\frac{\partial}{\partial u}\left(\sqrt{u}\right)\frac{\partial}{\partial net_j}\left(net_j^2+1\right)\right)\\
&= \frac{1}{2}\left(\frac{\partial}{\partial u}\left(u^{\frac{1}{2}}\right)\left(\frac{\partial}{\partial net_j}\left(net_j^2\right)+\frac{\partial}{\partial net_j}\left(1\right)\right)\right) && \text{(Sum Rule and simplifying)}\\
&= \frac{1}{2}\left(\frac{1}{2}u^{-\frac{1}{2}}\cdot 2\,net_j\right) && \text{(Power Rule)}\\
&= \frac{1}{2}\left(\frac{net_j}{\sqrt{u}}\right) && \text{(the 2s cancel out)}\\
&= \frac{net_j}{2\sqrt{u}}
\end{align*}
Substituting $net_j^2+1$ back in for $u$ we get:
\[ LHS = \frac{net_j}{2\sqrt{net_j^2+1}} \]
\[ \Rightarrow\ \frac{\partial y_j}{\partial net_j} = \frac{net_j}{2\sqrt{net_j^2+1}} + 1 \qquad \text{(combining LHS and RHS)} \]
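As a quick sanity check of this result, the sketch below (a minimal illustration, not part of the derivation; the helper names \texttt{activation} and \texttt{activation\_derivative} and the sample points are assumptions) compares the derived expression $\frac{net_j}{2\sqrt{net_j^2+1}}+1$ against a central finite difference of the hidden activation.
\begin{verbatim}
import math

def activation(x):
    # hidden activation: y = (sqrt(x^2 + 1) - 1) / 2 + x
    return (math.sqrt(x * x + 1) - 1) / 2 + x

def activation_derivative(x):
    # derived result: dy/dnet = net / (2 * sqrt(net^2 + 1)) + 1
    return x / (2 * math.sqrt(x * x + 1)) + 1

eps = 1e-6
for net in (-2.0, -0.5, 0.0, 0.5, 2.0):
    numeric = (activation(net + eps) - activation(net - eps)) / (2 * eps)
    print(net, activation_derivative(net), numeric)  # analytic vs. numeric slope
\end{verbatim}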
Since the error function $E$ is computed only at the output neurons, $E$ can be seen as a function of all neurons that receive input from hidden neuron $j$ (i.e., the output neurons $i = 1, \ldots, O$). Thus $\frac{\partial E}{\partial y_j}$ can be written as:
\[
\frac{\partial E}{\partial y_j} = \sum_{i=1}^{O}\left(\frac{\partial E}{\partial y_i}\,\frac{\partial y_i}{\partial net_i}\,\frac{\partial net_i}{\partial y_j}\right)
\]
Start with $\frac{\partial net_i}{\partial y_j}$, where $net_i = \sum_{j=1}^{H} w_{ij}\,y_j$:
\begin{align*}
\frac{\partial net_i}{\partial y_j} &= \frac{\partial}{\partial y_j}\left(\sum_{j=1}^{H} w_{ij}\,y_j\right)\\
&= \frac{\partial}{\partial y_j}\left(w_{i1}y_1 + w_{i2}y_2 + \ldots + w_{ij}y_j + \ldots + w_{iH}y_H\right) && \text{(Expanding } net_i\text{)}\\
&= \frac{\partial}{\partial y_j}\left(w_{i1}y_1\right) + \frac{\partial}{\partial y_j}\left(w_{i2}y_2\right) + \ldots + \frac{\partial}{\partial y_j}\left(w_{ij}y_j\right) + \ldots + \frac{\partial}{\partial y_j}\left(w_{iH}y_H\right) && \text{(Sum Rule)}\\
&= \frac{\partial}{\partial y_j}\left(w_{ij}y_j\right)\\
&= w_{ij} \qquad (1)
\end{align*}

Now solve $\frac{\partial y_i}{\partial net_i}$. Since the output neurons use the identity activation, $y_i = net_i$, so
\[
\frac{\partial y_i}{\partial net_i} = \frac{\partial}{\partial net_i}\left(net_i\right) = 1 \qquad (2)
\]
Lastly solve $\frac{\partial E}{\partial y_i}$:
\begin{align*}
\frac{\partial E}{\partial y_i} &= \frac{\partial}{\partial y_i}\left(\sum_{i=1}^{O}\left(\ln(t_i+1)-\ln(y_i+1)\right)^2\right)\\
&= \frac{\partial}{\partial y_i}\Big(\left(\ln(t_1+1)-\ln(y_1+1)\right)^2 + \ldots + \left(\ln(t_i+1)-\ln(y_i+1)\right)^2 + \ldots\\
&\qquad + \left(\ln(t_O+1)-\ln(y_O+1)\right)^2\Big) && \text{(Expanding } E\text{)}\\
&= \frac{\partial}{\partial y_i}\left(\left(\ln(t_1+1)-\ln(y_1+1)\right)^2\right) + \ldots + \frac{\partial}{\partial y_i}\left(\left(\ln(t_i+1)-\ln(y_i+1)\right)^2\right) + \ldots\\
&\qquad + \frac{\partial}{\partial y_i}\left(\left(\ln(t_O+1)-\ln(y_O+1)\right)^2\right) && \text{(Sum Rule)}\\
&= \frac{\partial}{\partial y_i}\left(\left(\ln(t_i+1)-\ln(y_i+1)\right)^2\right)
\end{align*}
Let $u = \ln(t_i+1)-\ln(y_i+1)$. Then
\[
\frac{\partial E}{\partial y_i} = \frac{\partial}{\partial u}\left(u^2\right)\frac{\partial}{\partial y_i}\left(\ln(t_i+1)-\ln(y_i+1)\right), \qquad \frac{\partial}{\partial u}\left(u^2\right) = 2u \qquad \text{(Chain Rule)}
\]
\[
\frac{\partial}{\partial y_i}\left(\ln(t_i+1)-\ln(y_i+1)\right) = \frac{\partial}{\partial y_i}\left(\ln(t_i+1)\right) - \frac{\partial}{\partial y_i}\left(\ln(y_i+1)\right) \qquad \text{(Sum Rule)}
\]
Let $p = y_i+1$. Since $t_i$ does not depend on $y_i$,
\[
\frac{\partial}{\partial y_i}\left(\ln(t_i+1)\right) = 0
\]
and
\[
\frac{\partial}{\partial y_i}\left(\ln(y_i+1)\right) = \frac{\partial}{\partial p}\left(\ln(p)\right)\frac{\partial}{\partial y_i}\left(y_i+1\right) = \frac{1}{p}\cdot 1 \qquad \text{(Chain Rule)}
\]
so
\[
\frac{\partial}{\partial y_i}\left(\ln(t_i+1)-\ln(y_i+1)\right) = 0 - \frac{1}{p} = -\frac{1}{p}
\]
Substituting $y_i+1$ back in for $p$ and $\ln(t_i+1)-\ln(y_i+1)$ back in for $u$ we get:
\[
\frac{\partial E}{\partial y_i} = \frac{-2\left(\ln(t_i+1)-\ln(y_i+1)\right)}{y_i+1} \qquad (3)
\]
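The same kind of numerical spot check works here. The sketch below is an illustration only (the helper names \texttt{error\_term} and \texttt{error\_term\_derivative} and the values of $t_i$ and $y_i$ are assumptions); it compares equation (3) with a central finite difference of the per-output error term.
\begin{verbatim}
import math

def error_term(t, y):
    # per-output error: (ln(t + 1) - ln(y + 1))^2
    return (math.log(t + 1) - math.log(y + 1)) ** 2

def error_term_derivative(t, y):
    # derived result (3): dE/dy_i = -2 (ln(t + 1) - ln(y + 1)) / (y + 1)
    return -2 * (math.log(t + 1) - math.log(y + 1)) / (y + 1)

eps = 1e-6
t, y = 0.8, 0.3
numeric = (error_term(t, y + eps) - error_term(t, y - eps)) / (2 * eps)
print(error_term_derivative(t, y), numeric)  # analytic vs. numeric derivative
\end{verbatim}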

Now substituting the answers from equations (1), (2) and (3) back into $\frac{\partial E}{\partial y_j}$ we get:
\[
\frac{\partial E}{\partial y_j} = \sum_{i=1}^{O}\left(\frac{-2\left(\ln(t_i+1)-\ln(y_i+1)\right)}{y_i+1}\cdot 1 \cdot w_{ij}\right)
\]
For simplicity's sake, the rest of the answer will refer to $\frac{\partial E}{\partial y_j}$ as $\delta_j$.
Now, plugging $\delta_j$ and the other derived factors back into the original chain-rule expression for $\frac{\partial E}{\partial w_{jk}}$:
\[
\frac{\partial E}{\partial w_{jk}} = \delta_j\, y_k\left(\frac{net_j}{2\sqrt{net_j^2+1}}+1\right)
\]
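Before moving to the update rule, a numerical gradient check is a convenient way to confirm that all three factors have been combined correctly. The sketch below is only an illustration: the tiny 2-2-2 network, the weight, input and target values, and the helper names (\texttt{forward}, \texttt{error}, \texttt{grad\_w1}) are assumptions, not part of the assignment. It compares the analytic $\frac{\partial E}{\partial w_{jk}}$ derived above with a central finite difference of $E$.
\begin{verbatim}
import math

x  = [0.5, 0.2]                      # input activations y_k (made-up values)
t  = [0.6, 0.9]                      # targets t_i (made-up values)
W1 = [[0.3, 0.1], [0.2, 0.4]]        # hidden weights w_jk (made-up values)
W2 = [[0.5, 0.3], [0.1, 0.6]]        # output weights w_ij (made-up values)

def forward(W1, W2, x):
    net_h = [sum(W1[j][k] * x[k] for k in range(len(x))) for j in range(len(W1))]
    y_h   = [(math.sqrt(n * n + 1) - 1) / 2 + n for n in net_h]   # hidden activation
    y_o   = [sum(W2[i][j] * y_h[j] for j in range(len(y_h)))
             for i in range(len(W2))]                              # identity outputs
    return net_h, y_h, y_o

def error(y_o, t):
    # E = sum_i (ln(t_i + 1) - ln(y_i + 1))^2
    return sum((math.log(ti + 1) - math.log(yi + 1)) ** 2 for ti, yi in zip(t, y_o))

def grad_w1(W1, W2, x, t):
    # analytic dE/dw_jk = delta_j * dy_j/dnet_j * y_k, as derived above
    net_h, y_h, y_o = forward(W1, W2, x)
    dE_dy_o = [-2 * (math.log(ti + 1) - math.log(yi + 1)) / (yi + 1)
               for ti, yi in zip(t, y_o)]                          # equation (3)
    grads = [[0.0] * len(x) for _ in W1]
    for j in range(len(W1)):
        delta_j = sum(dE_dy_o[i] * W2[i][j] for i in range(len(W2)))  # dE/dy_j
        dy_dnet = net_h[j] / (2 * math.sqrt(net_h[j] ** 2 + 1)) + 1
        for k in range(len(x)):
            grads[j][k] = delta_j * dy_dnet * x[k]
    return grads

# finite-difference check of dE/dw_00
eps = 1e-6
W1[0][0] += eps;      e_plus  = error(forward(W1, W2, x)[2], t)
W1[0][0] -= 2 * eps;  e_minus = error(forward(W1, W2, x)[2], t)
W1[0][0] += eps
print(grad_w1(W1, W2, x, t)[0][0], (e_plus - e_minus) / (2 * eps))
\end{verbatim}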
When updating the weights between the input and hidden layers, the following formula is used:
\[
w_{jk} = w_{jk} + \Delta w_{jk}
\]
To complete this formula, $\Delta w_{jk}$ is as follows:
\[
\Delta w_{jk} = -\eta\left(\delta_j\, y_k\left(\frac{net_j}{2\sqrt{net_j^2+1}}+1\right)\right)
\]
where $\eta$ is the learning rate.
The final formula is:
\[
w_{jk} = -\eta\,\delta_j\, y_k\left(\frac{net_j}{2\sqrt{net_j^2+1}}+1\right) + w_{jk}
\]
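To make the final formula concrete, the sketch below performs a single weight update with the derived $\Delta w_{jk}$ and shows the error shrinking. It is a minimal, hypothetical example: the 1-1-1 network, the constants ($y_k$, $t_i$, $\eta$, the two weights) and the helper name \texttt{network\_error} are all assumptions for illustration only.
\begin{verbatim}
import math

y_k, t_i, eta = 0.4, 0.7, 0.1        # made-up input, target and learning rate
w_jk, w_ij    = 0.5, 0.8             # made-up hidden and output weights

def network_error(w_jk):
    net_j = w_jk * y_k
    y_j = (math.sqrt(net_j ** 2 + 1) - 1) / 2 + net_j   # hidden activation
    y_i = w_ij * y_j                                     # identity output
    return (math.log(t_i + 1) - math.log(y_i + 1)) ** 2, net_j, y_i

E_before, net_j, y_i = network_error(w_jk)
delta_j = -2 * (math.log(t_i + 1) - math.log(y_i + 1)) / (y_i + 1) * w_ij  # dE/dy_j
dw_jk = -eta * delta_j * (net_j / (2 * math.sqrt(net_j ** 2 + 1)) + 1) * y_k
w_jk = w_jk + dw_jk                                      # w_jk = w_jk + delta w_jk
E_after = network_error(w_jk)[0]
print(E_before, E_after)             # the error should decrease after the step
\end{verbatim}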