
python 3.x - Tensorflow is unable to train to predict simple multiplication - Stack Overflow


As a starting point, I am trying to create a neural network that predicts simple multiplication, with the goal of changing the formula at a later date. My initial thought was that it would be trivial to do, and the code itself is quite simple, but the model does not train at all: it learns nearly nothing, and its predictions do not even begin to approach anything reasonable.

The code is:

import tensorflow as tf

import numpy as np
import random

import os.path


def compileModel():
    model = tf.keras.Sequential()
    model.add( tf.keras.layers.InputLayer(shape=(2,)) )
    model.add( tf.keras.layers.Dense(1024) )
    model.add( tf.keras.layers.Dense(1024) )
    model.add( tf.keras.layers.Dense(units=1) )
    
    model.compile(loss="mean_squared_error", optimizer="adam", metrics=["mse"])
    
    return model

def generateData(num):
    vals = np.zeros((num, 2))
    res = np.zeros(num)
    for i in range(num):
        a = random.uniform(-100,100)
        b = random.uniform(-100,100)
        c = a*b
        
        vals[i][0] = a
        vals[i][1] = b
        res[i] = c
    return (vals, res)


modelFilename = 'saved_model.keras'
runTraining = False

if not runTraining and not os.path.isfile(modelFilename):
    runTraining = True

if runTraining:
    trainData, trainRes = generateData(100000)
    
    model = compileModel()
    
    model.fit(x=trainData, y=trainRes, epochs=12, batch_size=100)
    model.save(filepath=modelFilename)
else:
    model = tf.keras.models.load_model(modelFilename)


testData, testRes = generateData(1000)
model.evaluate(x=testData, y=testRes)

print(testData.shape)

res = model.predict(testData[0:1])
print(testData[0:1])
print(testRes[0:1])
print(res)

Here I am generating two numbers from -100 to 100 as the input and simply multiplying them to get the correct answer. Everything runs, and it appears to go through the epochs, but then nothing useful is predicted.

My guess would be that I have made a mistake in setting up the model itself. Here I am using two dense layers with 1024 units each. I have tried playing with the number of layers and with the number of units, but all that does is increase or decrease the time it takes for the model to train.


asked Jan 19 at 6:12 by v010dya

1 Answer


I think part of what might really be hurting you here is the wide range of the inputs; in my experience these models perform best when values are in the neighborhood of 0 to 1. Adding normalization to the input and denormalization to the output may help quite a bit.
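As one possible way to do that, here is a minimal sketch of manual input/output scaling. The scale constants are assumptions taken from the question's data ranges (inputs in [-100, 100], so products in [-10000, 10000]), not part of the original code:

```python
import numpy as np

# Hypothetical scaling helpers: map inputs to roughly [-1, 1] and the
# target to roughly [-1, 1], then undo the target scaling on predictions.
INPUT_SCALE = 100.0            # inputs are drawn from [-100, 100]
OUTPUT_SCALE = 100.0 * 100.0   # products therefore fall in [-10000, 10000]

def normalize(vals, res):
    # Scale features and targets before model.fit()
    return vals / INPUT_SCALE, res / OUTPUT_SCALE

def denormalize(pred):
    # Undo the target scaling after model.predict()
    return pred * OUTPUT_SCALE

# Round-trip check on a single sample
vals = np.array([[50.0, -20.0]])
res = np.array([-1000.0])
nv, nr = normalize(vals, res)
assert np.allclose(denormalize(nr), res)
```

You would train on the normalized pair and call `denormalize` on the model's predictions; Keras also ships a `tf.keras.layers.Normalization` layer that can learn these statistics from the data via `adapt`, if you prefer to keep the scaling inside the model.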

Another factor that is likely hurting performance is that the activation function on your dense layers is linear, which isn't great. I reran your code with the activation set to 'relu' on the dense layers and got approximately 1% error on the output.

Here is the code I tested with (which has only minimal alterations):

import tensorflow as tf

import numpy as np
import random

import os.path


def compileModel():
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.InputLayer(shape=(2,)))
    model.add(tf.keras.layers.Dense(1024, activation='relu'))
    model.add(tf.keras.layers.Dense(1024, activation='relu'))
    model.add(tf.keras.layers.Dense(units=1))

    model.compile(loss="mean_squared_error", optimizer="adam", metrics=["mse"])

    return model


def generateData(num):
    vals = np.zeros((num, 2))
    res = np.zeros(num)
    for i in range(num):
        a = random.uniform(-100, 100)
        b = random.uniform(-100, 100)
        c = a * b

        vals[i][0] = a
        vals[i][1] = b
        res[i] = c
    return (vals, res)


modelFilename = 'saved_model.keras'
runTraining = False

if not runTraining and not os.path.isfile(modelFilename):
    runTraining = True

if runTraining:
    trainData, trainRes = generateData(100000)

    model = compileModel()

    model.fit(x=trainData, y=trainRes, epochs=12, batch_size=100)
    model.save(filepath=modelFilename)
else:
    model = tf.keras.models.load_model(modelFilename)

testData, testRes = generateData(1000)
model.evaluate(x=testData, y=testRes)

print(testData.shape)

res = model.predict(testData[0:1])
print(testData[0:1])
print(testRes[0:1])
print(res)

And here is the output it generated:

Epoch 1/12
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 5s 4ms/step - loss: 1773706.0000 - mse: 1773706.1250
Epoch 2/12
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 5s 5ms/step - loss: 409373.8438 - mse: 409373.8438
Epoch 3/12
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 5s 5ms/step - loss: 115835.0000 - mse: 115835.0000
Epoch 4/12
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 5s 5ms/step - loss: 13953.7402 - mse: 13953.7402
Epoch 5/12
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 5s 5ms/step - loss: 4830.2773 - mse: 4830.2773
Epoch 6/12
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 5s 5ms/step - loss: 2523.4836 - mse: 2523.4836
Epoch 7/12
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 4s 4ms/step - loss: 1607.8860 - mse: 1607.8860
Epoch 8/12
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 5s 5ms/step - loss: 1378.7347 - mse: 1378.7347
Epoch 9/12
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 5s 5ms/step - loss: 1488.6558 - mse: 1488.6558
Epoch 10/12
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 5s 5ms/step - loss: 1009.2731 - mse: 1009.2731
Epoch 11/12
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 4s 4ms/step - loss: 1667.1349 - mse: 1667.1349
Epoch 12/12
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 5s 5ms/step - loss: 863.3384 - mse: 863.3384
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 621.0710 - mse: 621.0710 
(1000, 2)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 42ms/step
[[ 76.60049687 -70.62534063]]
[-5409.93618401]
[[-5450.9204]]

And it's worth noting that the model is still making good strides in reducing the error with each epoch (epoch 12 halved the error of the previous epoch), so the approximation should improve further with more epochs as the model finishes converging.
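One way to let the run continue until convergence, rather than picking a fixed epoch count, is an early-stopping callback. This is a sketch, not part of the original answer; the epoch count and patience are arbitrary choices:

```python
import tensorflow as tf

# Stop automatically once the training loss stops improving for 5 epochs,
# keeping the best weights seen so far.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="loss", patience=5, restore_best_weights=True
)

# Would replace the fixed-epoch call in the answer's code, e.g.:
# model.fit(x=trainData, y=trainRes, epochs=100, batch_size=100,
#           callbacks=[early_stop])
```

Monitoring a validation loss (via `validation_split` in `fit`) instead of the training loss would also guard against overfitting, though for this synthetic task the training loss is a reasonable proxy.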
