Friday, December 23, 2016

Expert Geoffrey Hinton

The first email:
Hello Mr. Hinton,

My name is Sergej Krivonos. Here is my LinkedIn page.
I want to use boost::mpl to implement a compile-time neural network training algorithm.
I suppose it might speed up recognition jobs thanks to compile-time and link-time optimizations, eliminating multiplications by zero, and precalculating values.
Recently I started taking your course on Coursera.
Thus I have a strong desire to design and implement compile-time-trained recognition, but I still might need some support. Please let me know your thoughts as an expert.
The first answer:
There are 75,000 people taking my course.
Geoff
The next email:
Thank you very much for answering.
Is it worth it to implement compile-time training?
Is it possible to find a grant for this?
Maybe I shouldn't have mentioned participating in his course.

To be continued???
Oh Yeah...


Merry Christmas, Geoffrey, and a Happy New Year!

I present to you the flat NN model:

The result of compile-time training allows us to precalculate the weights and make each output equal to a sum of products of inputs and the resulting output weights - the flat model.
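
In code, a flat-model output is nothing more than a dot product of the inputs with precalculated weights. A minimal C++ sketch, assuming three inputs and weights already produced by compile-time training (all names and values here are illustrative):

#include <array>
#include <cstddef>

// Hypothetical weights for one output, assumed to come out of
// compile-time training; the values are made up for illustration.
constexpr std::array<double, 3> flat_weights = {0.42, -1.3, 0.07};

// One flat-model output: a plain sum of input*weight products (C++14).
constexpr double flat_output(const std::array<double, 3>& in) {
    double out = 0;
    for (std::size_t j = 0; j < in.size(); ++j)
        out += in[j] * flat_weights[j];
    return out;
}

// The compiler can evaluate the whole output at compile time.
static_assert(flat_output({1, 0, 0}) == 0.42, "folded at compile time");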

Happy New Year, Mr. Hinton! Our kids will see adequately speaking robots!

[picture: the flat NN model]

2016-12-22 17:16 GMT+02:00 Sergei Krivonos <sergeikrivonos@gmail.com>:
Thank you very much for answering.
Is it worth it to implement compile-time training?
Is it possible to find a grant for this?



--
Regards,
Sergei Krivonos
skype: sergio_krivonos

There are lots of questions about the math behind neural networks, and about machine learning too.
These pictures show that in reality there is no need for more than two layers in the neural networks used, for example, in the FANN library; more layers are only needed by the current learning algorithm. You can build a two-layer equivalent of any FANN net, but to make it able to learn, a new learning algorithm needs to be developed. It seems doable, though. I guess Geoffrey Hinton is the man who could develop such an algorithm. But would he be interested in such trivial networks that do not even use sigmoids? Who knows... Only Geoffrey can answer such a question.
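
For the linear nets discussed here (no sigmoids), the two-layer equivalence follows from matrix multiplication: two stacked weight matrices collapse into their product. A minimal C++ sketch under that linearity assumption, with illustrative names:

#include <array>
#include <cstddef>

// Collapse two stacked linear layers, w2 (Mid -> Out) over w1 (In -> Mid),
// into the single equivalent layer flat = w2 * w1 (In -> Out).
template <std::size_t Out, std::size_t Mid, std::size_t In>
std::array<std::array<double, In>, Out>
collapse(const std::array<std::array<double, Mid>, Out>& w2,
         const std::array<std::array<double, In>, Mid>& w1) {
    std::array<std::array<double, In>, Out> flat{};
    for (std::size_t o = 0; o < Out; ++o)
        for (std::size_t i = 0; i < In; ++i)
            for (std::size_t m = 0; m < Mid; ++m)
                flat[o][i] += w2[o][m] * w1[m][i];
    return flat;
}

Repeating this collapse reduces any stack of linear layers to a single one; with sigmoids in between, the equivalence breaks down, which is why a new learning algorithm would be needed for the flat form.
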
Maybe the human cortex performs the function of such a "precompiled" flat but much faster neural network, using electromagnetics for neuromediation, which is much faster. It might be updated while we sleep. Neuroscientists might be interested in uncovering this.
Another question is: why are the outputs dependent on each other? To save computer memory? That is not relevant for a potential compile-time algorithm. Each output is a simple polynomial function of the inputs.
The next question is why we produce such a simple function at all. If we extrapolate the training data, we can build a much more precise function for every output, much faster.
I see lots of questions for the future of neural networks. OpenMind is the library that would cover a complete redesign of the machine learning approach, context handling, and a goal-oriented reaching algorithm. It only needs some investment - maybe a university or government grant.

2 comments:

  1. The reason why is simple: all neural network simulations currently perform a terrific amount of calculation, and this is what delays adequately speaking robots.
    This article says we can reduce runtime calculations dramatically if we do the whole neural network training at compile time and thus know the weights. Strictly speaking, it is not necessary to do the training itself at compile time, but its result must be available at compile time to calculate the flat-model neural network.

    So, an example. Keep in mind that all the w-prefixed variables (the weights) have values known at compile time.

    The next-layer neuron n43, for example, would be:

    n43 = w313*n31 + w323*n32 + w333*n33

    where the lower layers are

    n31 = w211*n21 + w221*n22 + w231*n23
    n32 = w212*n21 + w222*n22 + w232*n23
    n33 = w213*n21 + w223*n22 + w233*n23

    n21 = w111*i1 + w121*i2 + w131*i3
    n22 = w112*i1 + w122*i2 + w132*i3
    n23 = w113*i1 + w123*i2 + w133*i3

    Substituting:

    n43 = w313*(w211*n21 + w221*n22 + w231*n23) +
    + w323*(w212*n21 + w222*n22 + w232*n23) +
    + w333*(w213*n21 + w223*n22 + w233*n23) =

    Expanding n21, n22, n23 in terms of the inputs and collecting the terms by input:

    = i1*(w111*w211*w313 + w112*w221*w313 + w113*w231*w313 +
    + w111*w212*w323 + w112*w222*w323 + w113*w232*w323 +
    + w111*w213*w333 + w112*w223*w333 + w113*w233*w333) +
    + i2*(w121*w211*w313 + w122*w221*w313 + w123*w231*w313 +
    + w121*w212*w323 + w122*w222*w323 + w123*w232*w323 +
    + w121*w213*w333 + w122*w223*w333 + w123*w233*w333) +
    + i3*(w131*w211*w313 + w132*w221*w313 + w133*w231*w313 +
    + w131*w212*w323 + w132*w222*w323 + w133*w232*w323 +
    + w131*w213*w333 + w132*w223*w333 + w133*w233*w333) =

    = i1*weights_precalculated_1 + i2*weights_precalculated_2 + i3*weights_precalculated_3


    As all the weights would be known at compile time, we will be able to create a super-fast neural network.
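
    To make this concrete, a hedged C++14 sketch (weight values and names are made up for illustration): the precalculated coefficients from the derivation above are computed in constexpr functions, so each run-time output costs only one multiply-add per input.

    // Illustrative compile-time weights (made-up values):
    // w1[j][k] feeds input i(j+1) into n2(k+1),
    // w2[j][k] feeds n2(j+1) into n3(k+1),
    // w3[k]    feeds n3(k+1) into the output n43.
    constexpr double w1[3][3] = {{.1, .2, .3}, {.4, .5, .6}, {.7, .8, .9}};
    constexpr double w2[3][3] = {{.2, .1, .0}, {.3, .5, .4}, {.6, .8, .7}};
    constexpr double w3[3]    = {.9, .2, .4};

    // weights_precalculated_(j+1): the coefficient of input i(j+1) in the flat model.
    constexpr double precalc(int j) {
        double c = 0;
        for (int m = 0; m < 3; ++m)      // neuron of layer 2
            for (int k = 0; k < 3; ++k)  // neuron of layer 3
                c += w1[j][m] * w2[m][k] * w3[k];
        return c;
    }

    constexpr double wp1 = precalc(0), wp2 = precalc(1), wp3 = precalc(2);

    // The run-time cost of n43 is just three multiply-adds.
    double n43(double i1, double i2, double i3) {
        return i1 * wp1 + i2 * wp2 + i3 * wp3;
    }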

    This article also mentions a side effect: what do we pay for this? We cannot train the flat model directly. We need to train the full model first, then compile it to the flat model (the cortex), and then use its super-fast advantages. So we pay with the time it takes to convert to the flat model, but only if we need additional training on the go. A bunch of use cases do not need additional training on the go. Almost every AI program uses pretrained networks!
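
    One way this pipeline could look, sketched under the assumption that the full model was already trained and collapsed elsewhere: a small generator program writes the flat weights into a header, and the final build compiles them in as constants. The file name and values are illustrative.

    #include <cstddef>
    #include <fstream>
    #include <vector>

    int main() {
        // Stand-in for the flat weights obtained by training the full model
        // and collapsing it (values made up for illustration).
        std::vector<double> flat = {0.42, -1.3, 0.07};
        std::ofstream h("flat_weights.h");  // generated header, illustrative name
        h << "constexpr double flat_weights[] = {";
        for (std::size_t j = 0; j < flat.size(); ++j)
            h << (j ? ", " : "") << flat[j];
        h << "};\n";
    }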

    We can greatly speed up AI with this information. It is terrific!

    Another point of the author's is that if we need retrainable AI, then we might be better off using extrapolation algorithms instead of the simple polynomial-finding process, which is a defect of neural networks.

  2. Still, some questions like "how" may appear. Here is how: https://github.com/iHateInventNames/openmind/blob/master/omnn/omnn.h
