Skip to Content

unstable weighted median code

I'm trying to translate a O(n) weighted median from c to R [so, as opposed to various other SO questions, here, we don't start from a sorted array]
--originally, i was doing a c++ implementation but my cpp code is
running unto the same problem as the R code--.

(As for the original code, see the /src/wgt_himed_templ.h file here)

Basically, the code takes a float vector a and computes it's weighted
median using the integer weights w.

My R code is sometimes running unto an infinite recursion problem
(see reproducible example below) and i'm not sure what the reason is
(note that we can assume that there are no duplicates in a).

whimed<-function(a,w,n,a_cand,a_srt,w_cand){
    w_tot=sum(w)
    found=FALSE
    wrest=0
    jj<-0
    while(!found){
        a_srt=a;
        n2=as.integer(n/2);
        #the sort() here is a shorthand. In reality, it's a partial sort
        trial=sort(a_srt)[n2] 
        print(trial)
        wleft=0
        wmid=0
        wright=0
        for(i in 0:(n-1)){
            if(a[i+1]<trial)        wleft=wleft+w[i+1]
            if(a[i+1]>trial)        wright=wright+w[i+1]
            if(abs(a[i+1]-trial)<1e-6)  wmid=wmid+w[i+1]
        }
        kcand=0;
        if(2*(wrest+wleft)>w_tot){
            for(i in 0:(n-1)){
                if(a[i+1]<trial){
                    a_cand[kcand+1]=a[i+1];
                    w_cand[kcand+1]=w[i+1];
                    kcand=kcand+1;
                }
            }
        } else {
            if(2*(wrest+wleft+wmid)<=w_tot){
                for(i in 0:(n-1)){
                    if(a[i+1]>trial){
                        a_cand[kcand+1]=a[i+1];
                        w_cand[kcand+1]=w[i+1];
                        kcand=kcand+1;
                    }
                }
                wrest=wrest+wleft+wmid;
            } else {
                retv=trial
                found=TRUE
            }
        }
        n=kcand;
        a=a_cand;
        w=w_cand;
        jj<-jj+1
        if(jj>100){
            found=TRUE; 
            retv<-trial;
            print("achtung")
        }
    }
    return(retv);
}

Example where it doesn't get stuck:

n<-100
set.seed(1)
a<-runif(n)
w<-sample(1:20,n,replace=TRUE)
w_cand<-rep(NA,n)
a_srt<-rep(NA,n)
a_cand<-rep(NA,n)   
whimed(a=a,w=w,n=n,a_cand=a_cand,a_srt=a_srt,w_cand=w_cand)

Example where it gets stuck:

n<-100
set.seed(2)
a<-runif(n)
w<-sample(1:20,n,replace=TRUE)
w_cand<-rep(NA,n)
a_srt<-rep(NA,n)
a_cand<-rep(NA,n)   
whimed(a=a,w=w,n=n,a_cand=a_cand,a_srt=a_srt,w_cand=w_cand)

I'm not really sure what the problem is :~. Anyone has an idea?