当前位置:网站首页>Bert-whitening 向量降维及使用

Bert-whitening 向量降维及使用

2022-06-24 13:04:00 loong_XL

参考:https://kexue.fm/archives/8069
https://kexue.fm/archives/9079
https://zhuanlan.zhihu.com/p/531476789

输入:vv是多个向量组成的三维矩阵
在这里插入图片描述
结果:v_data1 256维度
在这里插入图片描述


def compute_kernel_bias(vecs, n_components=256):
    """计算kernel和bias
    vecs.shape = [num_samples, embedding_size],
    最后的变换:y = (x + bias).dot(kernel)
    """
    mu = vecs.mean(axis=0, keepdims=True)
    cov = np.cov(vecs.T)
    # print(cov)
    u, s, vh = np.linalg.svd(cov)
    print(np.diag(1 / np.sqrt(s) ))
    W = np.dot(u, np.diag(1 / np.sqrt(s)))
    return W[:, :n_components], -mu


def transform_and_normalize(vecs, kernel=None, bias=None):
    """ 最终向量标准化
    """
    if not (kernel is None or bias is None):
        vecs = (vecs + bias).dot(kernel)
    return vecs / (vecs**2).sum(axis=1, keepdims=True)**0.5


v_data = np.array(vv[0])    ## vv[0]多个向量组成的二维矩阵,如果输入一个向量的二维矩阵计算会报错

kernel,bias=compute_kernel_bias(v_data)
# print(kernel,bias)
v_data1=transform_and_normalize(v_data, kernel=kernel, bias=bias)

***线上单个向量就把上面整体计算出的kernel,bias用上,直接transform_and_normalize(v_data, kernel=kernel, bias=bias)就行


import numpy as np
data = np.random.rand(5,768)
print('data.shape = ')
print(data.shape,data)
def compute_kernel_bias(vecs):
    """计算kernel和bias
    vecs.shape = [num_samples, embedding_size],
    最后的变换:y = (x + bias).dot(kernel)
    """
    mu = vecs.mean(axis=0, keepdims=True)
    cov = np.cov(vecs.T)
    u, s, vh = np.linalg.svd(cov)
    W = np.dot(u, np.diag(1 / np.sqrt(s)))
    return W, -mu

def transform_and_normalize(vecs, kernel=None, bias=None):
    """应用变换,然后标准化
    """
    if not (kernel is None or bias is None):
        vecs = (vecs + bias).dot(kernel)
    return vecs / (vecs**2).sum(axis=1, keepdims=True)**0.5

kernel,bias = compute_kernel_bias(data)
kernel = kernel[:,:64]
print('kernel.shape = ')
print(kernel.shape)
print('bias.shape = ')
print(bias.shape)
data = transform_and_normalize(data, kernel, bias)
print('data.shape = ')
print(data.shape,data)

在这里插入图片描述
线上单个向量降维

data1 = np.random.rand(1,768)
data1_1 = transform_and_normalize(data1, kernel, bias)

在这里插入图片描述

原网站

版权声明
本文为[loong_XL]所创,转载请带上原文链接,感谢
https://blog.csdn.net/weixin_42357472/article/details/125441181

随机推荐