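function K=bow_kernel(S)
%function K=bow_kernel(S)
%
% This function computes the bag of words kernel matrix for the strings in
% S. The case of the letters is ignored.
%
%INPUTS
% S = a cell array containing the strings
%
%OUTPUT
% K = the bow kernel matrix (nS x nS, where nS is the number of strings)
%
%
%For more info, see www.kernel-methods.net
%
%Author: Tijl De Bie, 25/02/2004, adapted (for speed-up) on 21/10/04.

nS=length(S);
K=zeros(nS,nS);

% Preallocate the per-string containers
Swords=cell(nS,1);    % sorted unique words of each string
ns=cell(nS,1);        % occurrence count of each unique word
lengths=zeros(nS,1);  % number of unique words in each string

disp('Build dictionary')
dict={};
for k=1:nS
    s=S{k};
    % Split on whitespace, ignore case, and sort so that equal words are adjacent
    Swords_long=sort(lower(strread(s,'%s')));
    % Request the LAST occurrence of each word explicitly: the counting
    % below relies on it, and newer MATLAB releases return the first
    % occurrence by default.
    [Swords{k},indi]=unique(Swords_long,'last');
    % In a sorted list, differences of the last-occurrence indices are the counts
    ns{k}=diff([0;indi(:)]);
    dict=union(dict,Swords{k});
    lengths(k)=length(Swords{k});
end

disp('Compute kernel')
ndict=length(dict);
% endind(i) points at the next unmatched word of string i. Both dict and
% Swords{i} are sorted, so a single pass over the dictionary suffices.
endind=ones(nS,1);
for k=1:ndict
    d=dict{k};
    inds=[];   % indices of the strings that contain word d
    vec=[];    % number of occurrences of d in each of those strings
    for i=1:nS
        if endind(i)<=lengths(i)
            t=strcmp(Swords{i}{endind(i)},d);
            if t
                inds=[inds;i];
                vec=[vec;ns{i}(endind(i))];
                endind(i)=endind(i)+1;
            end
        end
    end
    % Add the contribution of word d to every pair of strings containing it
    K(inds,inds)=K(inds,inds)+vec*vec';
end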
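
%EXAMPLE (illustrative sketch, not part of the original file; the three
% strings below are made-up inputs chosen only to show the call)
%
%   S = {'The cat sat', 'the cat the dog', 'a bird'};
%   K = bow_kernel(S);
%
% Each entry K(i,j) is intended to be the inner product of the word-count
% vectors of S{i} and S{j}. Counting by hand for these inputs (with case
% ignored, so 'The' and 'the' are the same word), one would expect
%
%   K = [3 3 0
%        3 6 0
%        0 0 2]
%
% For instance, K(1,2) = 1*1 (cat) + 1*2 (the) = 3, and the third string
% shares no words with the other two, so its off-diagonal entries are 0.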