Author Topic: Neural Network functioning  (Read 548 times)
hagen85
Newbie
*
Posts: 18


« on: January 31, 2012, 10:11:57 PM »

Hi there,

I was doing some testing with the neural network operator and am wondering whether I misunderstand how it works.
Let's assume you have 1000 data instances (rows) in which the target variable is "yes" for every woman from a certain region above the age of 25, and "no" for every man from the same region and age group. Then you have another 1000 instances where it is the other way round. If I train the NNW (no X-Validation) on the first 1500 instances (in order) with shuffling switched off in the operator, shouldn't the network somehow "forget" that women lead to "yes"? Apparently it does not, because when I apply the model to the remaining 500 instances the classification rate is very poor. I have tried different learning rates and momentum values.

Thank you in advance for your ideas.
Regards
Hagen

hagen85
Newbie
*
Posts: 18


« Reply #1 on: February 18, 2012, 08:32:24 PM »

Hi again,

Maybe I have to ask this differently. Normally a NNW can be trained in batch mode or in online mode. How can I use the operator in RapidMiner in online mode (presenting one example at a time)?

Regards
Hagen
Marius
Global Moderator
Sr. Member
*****
Posts: 370



« Reply #2 on: February 29, 2012, 11:50:10 AM »

Hi Hagen,

RapidMiner does not (yet) support online or stream learning, but we are planning to release a stream mining framework in the future. After that, the operators and algorithms must be adapted to handle streams.

Currently, only the Naive Bayes model supports online learning.

Best,
Marius
hagen85
Newbie
*
Posts: 18


« Reply #3 on: March 26, 2012, 11:09:58 PM »

Hi,
thanks for your reply. I made an observation that seems very strange to me:
I have a dataset that contains two concepts which differ significantly.
I use Sliding Window Validation with cumulative learning and a neural net inside.
What happens is that if I switch off "shuffle" in the neural net operator, it classifies my data almost perfectly, meaning it adapts to the concept drift.

I do not understand that at all :-). Isn't the error minimized over the whole dataset, which would mean that the effects of both concepts balance each other out?
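
To make the question concrete, here is a minimal sketch of the two training modes on a toy problem. It is not RapidMiner code: the class `DriftOrderSketch`, its methods, the single-weight logistic "network", and the values 0.5 and 10 are all assumptions made up for illustration. The data are ordered so that concept A (woman = yes) comes first and concept B (woman = no) comes last. If the error is accumulated over the whole pass before the weight changes (batch mode), the two concepts should cancel as described above. If the weight is changed after every example (online mode) and the order is kept, the examples seen last would dominate.

```java
class DriftOrderSketch {

    // Logistic activation of the single output unit.
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Hypothetical toy data: each row is {x, y} with x = +1 (woman) or -1 (man).
    // First half, concept A: woman -> 1, man -> 0. Second half, concept B: reversed.
    static double[][] makeData(int perConcept) {
        double[][] data = new double[2 * perConcept][];
        for (int i = 0; i < 2 * perConcept; i++) {
            double x = (i % 2 == 0) ? 1.0 : -1.0;
            boolean conceptA = i < perConcept;
            double y = ((x > 0) == conceptA) ? 1.0 : 0.0;
            data[i] = new double[] {x, y};
        }
        return data;
    }

    // Online mode: the weight is updated after every example, in the given order.
    static double trainOnline(double[][] data, double rate, int cycles) {
        double w = 0.0;
        for (int c = 0; c < cycles; c++) {
            for (double[] ex : data) {
                double p = sigmoid(w * ex[0]);
                w += rate * (ex[1] - p) * ex[0];
            }
        }
        return w;
    }

    // Batch mode: the gradient is averaged over all examples, one update per cycle.
    static double trainBatch(double[][] data, double rate, int cycles) {
        double w = 0.0;
        for (int c = 0; c < cycles; c++) {
            double grad = 0.0;
            for (double[] ex : data) {
                double p = sigmoid(w * ex[0]);
                grad += (ex[1] - p) * ex[0];
            }
            w += rate * grad / data.length;
        }
        return w;
    }

    public static void main(String[] args) {
        double[][] data = makeData(1000);
        // w > 0 favours concept A, w < 0 favours concept B, w near 0 means no preference.
        System.out.println("online w = " + trainOnline(data, 0.5, 10));
        System.out.println("batch  w = " + trainBatch(data, 0.5, 10));
    }
}
```

In this toy setting the batch weight stays at zero because both concepts contribute equal and opposite gradients, while the online weight ends up negative, i.e. on the side of the concept presented last. Whether the neural net operator behaves the same way depends on how it applies its updates, which this sketch only assumes.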

I would be very grateful for ideas on that.

Regards
Hagen
« Last Edit: March 31, 2012, 07:00:49 PM by hagen85 »
hagen85
Newbie
*
Posts: 18


« Reply #4 on: March 31, 2012, 08:38:36 PM »

Hi there, me again :-)
Sorry for pushing on this, but I am using RapidMiner for my thesis and therefore need to make sure that I understand how it works. Regarding my last post: I had a look at the source code of ImprovedNeuralNetModel.java, in the method public void train(...
Code:
// optimization loop
for (int cycle = 0; cycle < maxCycles; cycle++) {
    double error = 0;
    int maxSize = exampleSet.size();
    for (int index = 0; index < maxSize; index++) {
        int exampleIndex = index;
        if (exampleIndices != null) {
            exampleIndex = exampleIndices[index];
        }

        Example example = exampleSet.getExample(exampleIndex);

        resetNetwork();

        calculateValue(example);

        double weight = 1.0;
        if (weightAttribute != null) {
            weight = example.getValue(weightAttribute);
        }

        double tempRate = learningRate * weight;
        if (decay) {
            tempRate /= cycle + 1;
        }

        error += calculateError(example) / numberOfClasses * weight;
        update(example, tempRate, momentum);
    }

I realized that the update function is called after each example, which would mean that all network weights are updated after every single example has been seen. Is this correct? I would be very grateful if someone could confirm whether I am right.
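
To spell out what I read from the excerpt, here is a small sketch that mirrors only the loop structure and the step-size rule. The class `UpdateScheduleSketch`, both method names, and the values 500, 1500 and 0.3 are made up for illustration. What update(...) does internally is not shown in the excerpt, so the sketch just counts the calls instead of changing any weights.

```java
class UpdateScheduleSketch {

    // Step-size rule as in the excerpt: learning rate times example weight,
    // divided by (cycle + 1) when decay is switched on.
    static double stepSize(double learningRate, double exampleWeight, boolean decay, int cycle) {
        double tempRate = learningRate * exampleWeight;
        if (decay) {
            tempRate /= cycle + 1;
        }
        return tempRate;
    }

    // Counts how often update(...) would be called by the two nested loops.
    static int countUpdates(int maxCycles, int numExamples) {
        int updates = 0;
        for (int cycle = 0; cycle < maxCycles; cycle++) {
            for (int index = 0; index < numExamples; index++) {
                updates++; // stands in for update(example, tempRate, momentum)
            }
        }
        return updates;
    }

    public static void main(String[] args) {
        // Illustrative values only, not taken from any RapidMiner process.
        System.out.println("update calls: " + countUpdates(500, 1500));
        for (int cycle = 0; cycle < 4; cycle++) {
            System.out.println("cycle " + cycle + ": step size " + stepSize(0.3, 1.0, true, cycle));
        }
    }
}
```

If this reading is right, update(...) runs once per example per cycle, i.e. maxCycles times exampleSet.size() calls in total, and with decay the step size shrinks from cycle to cycle but stays constant within a cycle.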
Regards