sysuNie

PyTorch

Posted 2018-01-16 | Updated: 2018-01-19 | Category: Tools and Resources

Word count: 4,460 | Reading time ≈ 21 min

Torch

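The torch package contains multi-dimensional data structures and the mathematical operations defined on them. It also provides many utilities and has a CUDA counterpart.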

Tensors

torch.is_tensor(obj)

import numpy as np
import torch

a = torch.randn(3, 4)        # a PyTorch tensor
b = np.random.randn(3, 4)    # a NumPy array, not a tensor
torch.is_tensor(a)
Out[1]: True
torch.is_tensor(b)
Out[2]: False

Checks whether obj is a tensor: returns True if obj is a PyTorch tensor, and False otherwise.


cache_model

Posted 2018-01-15 | Updated: 2018-01-15 | Category: Papers

Word count: 1,670 | Reading time ≈ 6 min

ABSTRACT

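We propose an extension to neural network language models that adapts their predictions to the recent history. Our model is a simplified version of memory-augmented networks: it stores past hidden activations as memory and accesses them through a dot product with the current hidden activation. This mechanism is very efficient and scales to very large memory sizes. We also draw a link between the use of external memory in neural networks and the cache models used with count-based language models. We demonstrate on several language modeling datasets that our approach performs better than recent memory-augmented networks.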
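As a rough illustration of the mechanism described in the abstract (a sketch, not the authors' implementation), the cache below keeps pairs of a past hidden state and the word that followed it, scores each stored state by its dot product with the current hidden state, and normalizes the scores into a distribution over the cached words. The function name `cache_distribution`, the scaling parameter `theta`, and the toy vectors are all assumptions made for this example.

```python
import math
from collections import defaultdict


def cache_distribution(memory, h_t, theta=1.0):
    """Return a probability distribution over the words stored in the cache.

    memory: list of (hidden_state, next_word) pairs from the recent history.
    h_t:    current hidden state (list of floats).
    theta:  assumed scaling factor controlling how peaked the distribution is.
    """
    scores = defaultdict(float)
    for h_i, word in memory:
        # Dot product between the current and the stored hidden state.
        dot = sum(a * b for a, b in zip(h_t, h_i))
        # Words that followed similar states accumulate more mass.
        scores[word] += math.exp(theta * dot)
    total = sum(scores.values())
    return {word: score / total for word, score in scores.items()}


# Toy history: three stored hidden states and the words that followed them.
memory = [
    ([1.0, 0.0], "cat"),
    ([0.0, 1.0], "dog"),
    ([0.9, 0.1], "cat"),
]
probs = cache_distribution(memory, h_t=[1.0, 0.0])
print(probs)
```

Because reading the cache needs only dot products and no learned read or write mechanism, the memory can grow without adding parameters, which is the efficiency argument the abstract makes.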

INTRODUCTION

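A language model is a probability distribution over sequences of words. It has many applications, such as machine translation, speech recognition, and dialogue agents. Although traditional neural network language models have achieved state-of-the-art performance in this area, they lack the ability to adapt to their recent history, which limits their use in dynamic environments. A recent approach to this problem is to augment these networks with an external memory. Such models can use the external memory to store new information and adapt to a changing environment.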

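Although these networks have obtained good results on language modeling datasets, they are computationally quite expensive. Typically, they have to learn a parameterized mechanism for reading from or writing to memory cells. This can limit both the size of the memory they can use and the amount of data they can be trained on. In this work, we propose a very lightweight alternative that shares some properties of memory-augmented networks, in particular the ability to adapt dynamically over time. By minimizing the computational burden of the memory, we can use a larger memory and scale to larger datasets. We observe in practice that this lets us surpass the performance of memory-augmented networks on different language modeling tasks.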


wordcloud

Posted 2018-01-13 | Updated: 2018-01-27 | Category: Tools and Resources

Word count: 528 | Reading time ≈ 2 min

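Today's post introduces a Python library, wordcloud. Its main job is to count the words in a text and display them as a word cloud. From the generated image we can see at a glance which words occur most often. One fun use is to tally the topics of the papers submitted to a conference and so get a sense of current research trends.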
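The counting step behind a word cloud can be sketched with the Python standard library alone. This illustrates the idea only and is not how `wordcloud` is implemented; the helper name `word_frequencies` and the sample text are made up for the example.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each lowercase word occurs in text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


sample = "deep learning for language modeling and deep learning for vision"
freq = word_frequencies(sample)
# The most frequent words are the ones a word cloud would draw largest.
print(freq.most_common(3))
```

A word cloud then maps each count to a font size, so frequent words dominate the image.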

Quick install

pip install wordcloud

Quickly generating a word cloud

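from wordcloud import WordCloud
import matplotlib.pyplot as plt

# Read the whole text file (assumed here to be UTF-8); the with-block closes it afterwards.
with open('moment.txt', 'r', encoding='utf-8') as fh:
    text = fh.read()

# width, height and margin set the image properties.
# background_color sets the background colour; the default is black.
wordcloud = WordCloud(background_color="white", width=1000, height=860, margin=2).generate(text)
# generate() tokenizes the whole text automatically, but it handles Chinese poorly;
# Chinese word segmentation is covered in my next post.
# The font_path parameter selects the font file:
# wordcloud = WordCloud(font_path=r'~\Fonts\simkai.ttf').generate(text)

plt.imshow(wordcloud)
plt.axis("off")
plt.show()

# Save the image. In the example in part three, the image is saved at the size of the mask.
wordcloud.to_file('test.png')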


Exploring Word Vectors with GloVe

Posted 2018-01-12 | Updated: 2018-01-12 | Category: Tools and Resources

Word count: 631 | Reading time ≈ 3 min

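When working with words, dealing with the huge but sparse domain of language is difficult. Even for a small corpus, a neural network needs to support thousands of discrete inputs and outputs.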

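Besides the sheer number of words, representing words as one-hot vectors fails to capture any information about the relationships between words.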

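Word vectors address this problem by representing words in a multi-dimensional vector space. This reduces the dimensionality of the problem from hundreds of thousands to just hundreds. In addition, the vector space can capture semantic relationships between words through the distance and the angle between vectors.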
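A small sketch of the point about angles: one-hot vectors of different words are always orthogonal, while dense vectors can place related words close together. The three-dimensional vectors below are invented for illustration and are not real GloVe values; `cosine_similarity` is a helper written for this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# One-hot vectors: every pair of distinct words has similarity 0.
one_hot = {"cat": [1, 0, 0], "dog": [0, 1, 0], "car": [0, 0, 1]}

# Hypothetical dense vectors (made-up numbers, not trained embeddings).
dense = {"cat": [0.9, 0.8, 0.1], "dog": [0.8, 0.9, 0.2], "car": [0.1, 0.2, 0.9]}

print(cosine_similarity(one_hot["cat"], one_hot["dog"]))
print(cosine_similarity(dense["cat"], dense["dog"]))
print(cosine_similarity(dense["cat"], dense["car"]))
```

With the toy values chosen here, "cat" and "dog" end up with a much smaller angle between them than "cat" and "car", which is the kind of relationship a one-hot encoding cannot express.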

[Figure: analogy]

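There are several techniques for creating word vectors. The word2vec algorithm predicts words in context (for example, the words most likely to appear with "the cat" are "the mouse"), while GloVe vectors are based on global counts over the whole corpus. The best feature of GloVe is that several sets of pre-trained word vectors can be downloaded easily.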


An Empirical Study of Language CNN for Image Captioning

Posted 2018-01-10 | Updated: 2018-01-10 | Category: Papers

Word count: 819 | Reading time ≈ 5 min

Abstract

In contrast to previous models, which predict the next word based on one previous word and the hidden state, our language CNN is fed with all the previous words and can model the long-range dependencies in history words, which are critical for image captioning.

Introduction

An image captioning model should be capable of capturing the implicit semantic information of an image and generating humanlike sentences. Most image captioning models follow the encoder-decoder pipeline.

Although models like LSTM networks have memory cells which aim to memorize history information over the long term, they are still limited to several time steps because long-term information is gradually diluted at every time step.

To better model the hierarchical structure and long-term dependencies in word sequences we adopt a language CNN which applies temporal convolution to extract features from sequences.
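To make "temporal convolution" concrete, here is a minimal sketch of a one-dimensional convolution sliding along the time axis of a sequence of word vectors. It illustrates the operation only and is not the paper's architecture; the function name `temporal_conv`, the kernel, and the toy embeddings are assumptions.

```python
def temporal_conv(sequence, kernel):
    """Slide kernel over sequence along the time axis (no padding, stride 1).

    sequence: list of T word vectors, each a list of D floats.
    kernel:   list of K weight vectors, each a list of D floats.
    Returns a list of T - K + 1 scalar features.
    As in most deep learning libraries, the kernel is not flipped.
    """
    k = len(kernel)
    features = []
    for t in range(len(sequence) - k + 1):
        window = sequence[t:t + k]
        # Sum of element-wise products between the window and the kernel.
        value = sum(
            x * w
            for vec, weights in zip(window, kernel)
            for x, w in zip(vec, weights)
        )
        features.append(value)
    return features


# Four words with 2-dimensional embeddings (toy values).
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
# A kernel spanning two consecutive words.
kernel = [[1.0, 0.0], [0.0, 1.0]]
features = temporal_conv(words, kernel)
print(features)
```

Each output feature summarizes a window of consecutive words, so stacking such layers lets later features cover progressively longer spans of the sentence.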

To summarize, our primary contribution lies in incorporating a language CNN, which is capable of capturing long-range dependencies in sequences, with RNNs for image captioning.


Exploring the Limits of Language Modeling

Posted 2018-01-09 | Updated: 2018-01-09 | Category: Papers

Word count: 1,721 | Reading time ≈ 11 min

ABSTRACT

We extend current models to deal with two key challenges present in this task: corpora and vocabulary sizes, and complex, long term structure of language. We perform an exhaustive study on techniques such as character Convolutional Neural Networks or Long-Short Term Memory, on the One Billion Word Benchmark.

Introduction

Models which can accurately place distributions over sentences not only encode complexities of language such as grammatical structure, but also distill a fair amount of information about the knowledge that a corpus may contain.

Language modeling can be applied to speech recognition, machine translation, text summarization, etc., where better language models improve the downstream metrics (such as word error rate for speech recognition, or BLEU score for translation).

When trained on vast amounts of data, language models compactly extract knowledge encoded in the training data. For example, when trained on movie subtitles, language models are able to generate basic answers to questions about object colors, facts about people, etc.


Data Processing for Language Models

Posted 2018-01-04 | Updated: 2018-01-04 | Category: Tools and Resources

Word count: 373 | Reading time ≈ 2 min

data = [("me gusta comer en la cafeteria".split(), "SPANISH"),
        ("Give it to me".split(), "ENGLISH"),
        ("No creo que sea una buena idea".split(), "SPANISH"),
        ("No it is not a good idea to get lost at sea".split(), "ENGLISH")]

Out[]:

[(['me', 'gusta', 'comer', 'en', 'la', 'cafeteria'], 'SPANISH'),
 (['Give', 'it', 'to', 'me'], 'ENGLISH'),
 (['No', 'creo', 'que', 'sea', 'una', 'buena', 'idea'], 'SPANISH'),
 (['No', 'it', 'is', 'not', 'a', 'good', 'idea', 'to', 'get', 'lost', 'at', 'sea'], 'ENGLISH')]
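A common next step with data in this shape is to map every word and label to an integer index so it can be fed to a model. The sketch below shows one way to do that; the names `word_to_ix` and `label_to_ix` are chosen for this example and are not taken from the post.

```python
data = [("me gusta comer en la cafeteria".split(), "SPANISH"),
        ("Give it to me".split(), "ENGLISH"),
        ("No creo que sea una buena idea".split(), "SPANISH"),
        ("No it is not a good idea to get lost at sea".split(), "ENGLISH")]

word_to_ix = {}
label_to_ix = {}
for sentence, label in data:
    for word in sentence:
        # Assign the next free index the first time a word is seen.
        if word not in word_to_ix:
            word_to_ix[word] = len(word_to_ix)
    if label not in label_to_ix:
        label_to_ix[label] = len(label_to_ix)

# Encode the first sentence as a list of indices.
encoded = [word_to_ix[w] for w in data[0][0]]
print(len(word_to_ix), label_to_ix, encoded)
```

Words are case-sensitive here, so "No" and "no" would receive different indices; lowercasing first is a design choice left open in this sketch.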


Feedforward Neural Networks

Posted 2018-01-03 | Updated: 2018-01-03 | Category: Tools and Resources

Word count: 1,578 | Reading time ≈ 6 min

Deep Learning for Natural Language Processing: Feedforward Neural Networks

Introduction

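The wave of deep learning has arrived, and it is now applied in every field. Its performance is astonishing, and it is fair to say that we have entered an era of artificial intelligence built on deep learning.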

Artificial intelligence makes the world a better place.

This chapter is mainly a summary based on studying the Stanford cs224d course and reading some useful blog posts.


Stacks and Queues

Posted 2018-01-03 | Updated: 2018-01-03 | Category: Reading

Word count: 1,347 | Reading time ≈ 6 min

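Stacks and Queues

Stack

A stack is a linear list in which insertion and deletion are restricted to the tail of the list.

The tail end is called the top of the stack, and the head end is called the bottom.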
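The definition above can be sketched in Python with a list whose end plays the role of the top. Pushes and pops both happen at the tail, so the last element pushed is the first one popped. The class and method names are chosen for this example.

```python
class Stack:
    """A stack backed by a Python list; the end of the list is the top."""

    def __init__(self):
        self._items = []

    def push(self, item):
        # Insert at the tail of the list, i.e. at the top of the stack.
        self._items.append(item)

    def pop(self):
        # Remove from the tail; raises IndexError if the stack is empty.
        return self._items.pop()

    def peek(self):
        # Look at the top element without removing it.
        return self._items[-1]

    def is_empty(self):
        return not self._items


s = Stack()
for x in (1, 2, 3):
    s.push(x)
popped = [s.pop(), s.pop(), s.pop()]
print(popped)
```

Using the tail of a Python list as the top keeps both `push` and `pop` at amortized constant time.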

[Figure: stackfeature]

A stack is a last-in, first-out (LIFO) linear list. The abstract data type of a stack is defined as follows:


Array

Posted 2018-01-02 | Updated: 2018-01-03 | Category: Reading

Word count: 1,975 | Reading time ≈ 9 min

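Linear List

Type Definition

In short, a linear list is a finite sequence of n data elements.

In more complex linear lists, a data element may consist of several data items. In that case the data element is usually called a record, and a linear list containing a large number of records is called a file.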

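ADT List {
  // Data objects
  D = {ai | ai ∈ ElemSet, i = 1, 2, ..., n, n ≥ 0}
  // Data relations
  R1 = {<ai-1, ai> | ai-1, ai ∈ D, i = 2, ..., n}
  // Basic operations
  InitList(&L)
    // Construct an empty linear list L
  DestroyList(&L)
    // Destroy the linear list L
  ClearList(&L)
    // Reset L to an empty list
  ListEmpty(L)
    // Return True if L is empty, otherwise False
  ListLength(L)
    // Return the number of data elements in L
  GetElem(L, i, &e)
    // Return the value of the i-th data element of L in e
  ListInsert(&L, i, e)
    // Insert the new data element e before position i in L; the length of L increases by 1.
}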
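One possible way to realize these operations is a thin wrapper around a Python list. This is a sketch under assumptions: the class name `LinearList` and its method names are chosen for the example, positions are 1-based to match the ADT, and DestroyList has no counterpart because Python reclaims the object automatically.

```python
class LinearList:
    """List-backed sketch of the ADT; positions are 1-based as in the ADT."""

    def __init__(self):            # InitList
        self._data = []

    def clear(self):               # ClearList
        self._data.clear()

    def is_empty(self):            # ListEmpty
        return len(self._data) == 0

    def length(self):              # ListLength
        return len(self._data)

    def get_elem(self, i):         # GetElem
        if not 1 <= i <= len(self._data):
            raise IndexError("position out of range")
        return self._data[i - 1]

    def insert(self, i, e):        # ListInsert: insert e before position i
        if not 1 <= i <= len(self._data) + 1:
            raise IndexError("position out of range")
        self._data.insert(i - 1, e)


lst = LinearList()
lst.insert(1, "b")
lst.insert(1, "a")   # goes before the current first element
lst.insert(3, "c")   # i = length + 1 appends at the end
print(lst.length(), lst.get_elem(1), lst.get_elem(2), lst.get_elem(3))
```

Returning the element from `get_elem` instead of writing it into an output parameter `&e` is the idiomatic Python equivalent of the ADT's signature.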
