python 2.7 - Naive Bayes Classifier loaded from saved pickle gives results that differ from train and test immediately

I encountered the same problem shown here. The solution doesn't seem to work for me. I'm not sure if anyone can help me with it. Thanks.

from sentimentanalyzer import tweettokenizer
from sentimentanalyzer import dataset
import json
import re
import collections
import nltk.metrics
import nltk.classify
import pickle

tweetstokenizer = tweettokenizer()
featurelist = []
tweets = []
dataset = dataset()
train_data = dataset.gettraindata()
test_data = dataset.gettestdata()

def extract_features(tweet):
    tweet_words = set(tweet)
    features = {}
    for word in featurelist:
        features['contains(%s)' % word] = (word in tweet_words)

    return features

trainsets = collections.defaultdict(set)
testsets = collections.defaultdict(set)
nbclassifier = None

train = True
if train:
    ... preprocessing codes above ...
    # generate training set
    print 'extracting features...'
    training_set = nltk.classify.util.apply_features(extract_features, tweets)

    # train naive bayes classifier
    print 'training dataset...'
    nbclassifier = nltk.NaiveBayesClassifier.train(training_set)

    print 'saving model...'
    f = open('naivebayesclassifier.pickle', 'wb')
    pickle.dump(nbclassifier, f)
    f.close()
else:
    f = open('naivebayesclassifier.pickle', 'rb')
    nbclassifier = pickle.load(f)
    f.close()

# test classifier
print 'testing model...'
for i, line in enumerate(test_data):
    tweetjson = json.loads(line)
    labelledsentiment = dataset.gettestsentiment(tweetjson['id_str']).encode('utf-8')
    trainsets[labelledsentiment].add(i)

    testtweet = tweetjson['text'].encode('utf-8')
    processedtesttweet = tweetstokenizer.preprocess(testtweet)
    sentiment = nbclassifier.classify(extract_features(tweetstokenizer.getfeaturevector(processedtesttweet)))
    testsets[sentiment].add(i)
    print "testtweet = %s, classified sentiment = %s, labelled sentiment = %s\n" % (testtweet, sentiment, labelledsentiment)
    # print "testtweet = %s, classified sentiment = %s, labelled sentiment = %s\n" % (testtweet, sentiment, labelledsentiment)

print 'positive precision:', nltk.metrics.precision(trainsets['positive'], testsets['positive'])
print 'positive recall:', nltk.metrics.recall(trainsets['positive'], testsets['positive'])
print 'positive f-measure:', nltk.metrics.f_measure(trainsets['positive'], testsets['positive'])
print 'negative precision:', nltk.metrics.precision(trainsets['negative'], testsets['negative'])
print 'negative recall:', nltk.metrics.recall(trainsets['negative'], testsets['negative'])
print 'negative f-measure:', nltk.metrics.f_measure(trainsets['negative'], testsets['negative'])
print 'neutral precision:', nltk.metrics.precision(trainsets['neutral'], testsets['neutral'])
print 'neutral recall:', nltk.metrics.recall(trainsets['neutral'], testsets['neutral'])
print 'neutral f-measure:', nltk.metrics.f_measure(trainsets['neutral'], testsets['neutral'])

print 'done'

The classifier, when trained and tested immediately, gives different results compared to the classifier loaded straight from the pickle without training. I cannot figure out why. Thanks.
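One possible cause, offered as a guess rather than a confirmed diagnosis: if the elided preprocessing inside the `if train:` branch is what fills `featurelist`, then on the load path `featurelist` stays empty and `extract_features` returns an empty dict for every tweet, so the unpickled classifier sees no features at all. A minimal sketch of one way around that is to pickle the feature list together with the classifier. The helper names `save_model` and `load_model`, the extra `featurelist` parameter on `extract_features`, and the placeholder string standing in for the trained classifier are all assumptions made for illustration, not part of the posted code.

```python
import os
import pickle
import tempfile


def extract_features(tweet, featurelist):
    """Same feature scheme as in the question, with the word list passed in."""
    tweet_words = set(tweet)
    return {'contains(%s)' % word: (word in tweet_words) for word in featurelist}


def save_model(path, classifier, featurelist):
    """Pickle the classifier together with the feature list it was trained on."""
    with open(path, 'wb') as f:
        pickle.dump({'classifier': classifier, 'featurelist': featurelist}, f)


def load_model(path):
    """Return (classifier, featurelist) from a bundle written by save_model."""
    with open(path, 'rb') as f:
        bundle = pickle.load(f)
    return bundle['classifier'], bundle['featurelist']


# With an empty feature list, every tweet maps to the same empty feature dict.
empty_features = extract_features(['good', 'day'], [])

# Round trip, using a placeholder object in place of the trained classifier.
model_path = os.path.join(tempfile.mkdtemp(), 'model.pickle')
save_model(model_path, 'placeholder-classifier', ['good', 'bad'])
classifier, featurelist = load_model(model_path)
restored_features = extract_features(['good', 'day'], featurelist)
```

In the posted script, the same idea would mean dumping `featurelist` alongside `nbclassifier` in the training branch and restoring both in the `else` branch before the test loop runs.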

python-2.7 nltk pickle
