The dissolution of the subject
This post follows up on my previous analysis of >100K images of paintings. I thought it would be illustrative to try to make mean images for each year, to get an idea of trends over time in a more spatially accurate way.
Resampling
To be able to make an average image, the images needs to be of the same dimensions. This means I first have to resample all the images to a common format. Preferably larger than the largest one to not lose any information. Since the images are thumbnail and therefore small, resample them all into 200 by 200 pixels would be enough. The smaller the better in terms of processing speed for the upcoming steps of the analysis. The first steps are the same as in my previous blog post up till the part where I extract the values of the pixels in the images in a for-loop, ending up with a data frame with information about each image (called d1 in the code below). I won’t reiterate these steps here. To resample the images I loaded one image at a time in a for-loop, separated the R, G, and B matrices, and resampled them separately using nearest-neighbor interpolation. I then concatenated them back into one object, and saved the resampled image. To speed things up, I used the foreach package to parallelize the task.
library(raster) library(readbitmap) library(abind) library(png) library(foreach) library(doSNOW) cl=makeCluster(4) registerDoSNOW(cl) newproj="+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0" foreach(i=1:length(d1$file), .packages=c("abind", "raster", "readbitmap", "png")) %dopar% { filename=as.character(d1$file[i]) original.image=read.bitmap(filename) Red=raster(original.image[,,1]) Green=raster(original.image[,,2]) Blue=raster(original.image[,,3]) projection(Red)=newproj;projection(Green)=newproj;projection(Blue)=newproj Red=resample(Red, raster(nrow=200, ncol=200, xmn=0, xmx=1, ymn=0, ymx=1), method="ngb") Green=resample(Green, raster(nrow=200, ncol=200, xmn=0, xmx=1, ymn=0, ymx=1), method="ngb") Blue=resample(Blue, raster(nrow=200, ncol=200, xmn=0, xmx=1, ymn=0, ymx=1), method="ngb") Red=t(matrix(Red@data@values, ncol=200)) Green=t(matrix(Green@data@values, ncol=200)) Blue=t(matrix(Blue@data@values, ncol=200)) resampled.image <- abind(Red, Green, Blue, along=3) writePNG(resampled.image, target = gsub("data_bbc", "data_bbc_resampled", filename)) }
The resampled images are now all 200 by 200 pixels. This is one example of this:
Averaging
To get an average, I loaded the images in RGB color space and simply took the average values of each element (pixel) in all the matrices (one matrix for each image), separately for the R, G, and B values. I then put the R, G and B matrices together into an image. I did this separately for each year.
d1$file=gsub("data_bbc", "data_bbc_resampled", d1$file) missings=setdiff(d1$file, list.files("C:/Art/data_bbc_resampled", full.name=T)) d1=d1[!(d1$file %in% missings),] library(readbitmap) library(abind) library(png) for (j in match(1800:1999, sort.years)){ pics=as.character(d1$file[d1$year==sort.years[j]]) npics=length(pics) imageFiles=lapply(pics, read.bitmap) comb=abind(imageFiles, along=4) meanImg=array(NA, c(200,200,3)) meanImg[,,1]=apply(comb[,,1,],c(1,2), FUN=mean) meanImg[,,2]=apply(comb[,,2,],c(1,2), FUN=mean) meanImg[,,3]=apply(comb[,,3,],c(1,2), FUN=mean) writePNG(meanImg, target = paste("C:/Art/bbc_mean_rgb/mean_", sort.years[j], ".png", sep="")) cat("created mean image for year",sort.years[j],"\n") }
And here’s the resulting images ordered next to each other:
Especially in the older paintings, you get this spooky average portrait. Like this one from 1803:
They seems to blur over time, when portraits are a smaller part of the total number of paintings. Portraits might also differ more among each other in later decades compared to earlier, where almost all subjects seems to be in the exact same spot.
Just changing the function used to aggregate pixel color values from mean to median, which might be a bit more robust, gave a similar result but with a bit brighter colors:
Frame artifacts
There are circles around the “head” in many of the mean images, especially the older ones. This is probably because some of the paintings are round. Like this one for example:
Realizing this, I thought this might me a problem for the results in my earlier blog post, maybe the whole trend of paintings getting less orange is just an effect of less round paintings that introduce other color artifacts. Although the problem might be small as most round paintings in the data base seems to be just surrounded by white, like this one:
This would exclude those pixels from that analysis since I thresholded pixels to have a saturation higher than 0.2.
To investigate if the frames were responsible for the color changes, I redid the earlier analysis on the resampled images. I divided the images into 16 squares like this:
and did a separate analysis for the different locations in the images.
library(readbitmap) library(colorspace) library(plotrix) sort.years=sort(unique(d1$year)) all.H01=list(); all.H02=list(); all.H03=list(); all.H04=list(); all.H05=list(); all.H06=list(); all.H07=list(); all.H08=list(); all.H09=list(); all.H10=list(); all.H11=list(); all.H12=list(); all.H13=list(); all.H14=list(); all.H15=list(); all.H16=list(); excl.dim1=NULL excl.dim2=NULL px.sample=50 m01=as.logical(c(rep(rep(c(1,0,0,0), each=50),50),rep(0,30000))) m02=as.logical(c(rep(rep(c(0,1,0,0), each=50),50),rep(0,30000))) m03=as.logical(c(rep(rep(c(0,0,1,0), each=50),50),rep(0,30000))) m04=as.logical(c(rep(rep(c(0,0,0,1), each=50),50),rep(0,30000))) m05=as.logical(c(rep(0,10000),rep(rep(c(1,0,0,0), each=50),50),rep(0,20000))) m06=as.logical(c(rep(0,10000),rep(rep(c(0,1,0,0), each=50),50),rep(0,20000))) m07=as.logical(c(rep(0,10000),rep(rep(c(0,0,1,0), each=50),50),rep(0,20000))) m08=as.logical(c(rep(0,10000),rep(rep(c(0,0,0,1), each=50),50),rep(0,20000))) m09=as.logical(c(rep(0,20000),rep(rep(c(1,0,0,0), each=50),50),rep(0,10000))) m10=as.logical(c(rep(0,20000),rep(rep(c(0,1,0,0), each=50),50),rep(0,10000))) m11=as.logical(c(rep(0,20000),rep(rep(c(0,0,1,0), each=50),50),rep(0,10000))) m12=as.logical(c(rep(0,20000),rep(rep(c(0,0,0,1), each=50),50),rep(0,10000))) m13=as.logical(c(rep(0,30000),rep(rep(c(1,0,0,0), each=50),50))) m14=as.logical(c(rep(0,30000),rep(rep(c(0,1,0,0), each=50),50))) m15=as.logical(c(rep(0,30000),rep(rep(c(0,0,1,0), each=50),50))) m16=as.logical(c(rep(0,30000),rep(rep(c(0,0,0,1), each=50),50))) for (j in match(1800:2000,sort.years)){ pics=as.character(d1$file[d1$year==sort.years[j]]) npics=length(pics) imageFiles=lapply(pics, read.bitmap) excltemp1=sum(sapply(1:length(imageFiles), function(x) length(dim(imageFiles[[x]]))!=3 )) imageFiles=imageFiles[sapply(1:length(imageFiles), function(x) length(dim(imageFiles[[x]]))==3 )] excltemp2=sum(sapply(1:length(imageFiles), function(x) dim(imageFiles[[x]])[3]!=3)) imageFiles=imageFiles[sapply(1:length(imageFiles), function(x) dim(imageFiles[[x]])[3]==3)] excl.dim1=append(excl.dim1, excltemp1) excl.dim2=append(excl.dim2, excltemp2) H=imageFiles H01=imageFiles; H02=imageFiles; H03=imageFiles; H04=imageFiles; H05=imageFiles; H06=imageFiles; H07=imageFiles; H08=imageFiles; H09=imageFiles; H10=imageFiles; H11=imageFiles; H12=imageFiles; H13=imageFiles; H14=imageFiles; H15=imageFiles; H16=imageFiles; for (i in 1:length(imageFiles)){ H[[i]]=as(RGB(as.vector(H[[i]][,,1]),as.vector(H[[i]][,,2]),as.vector(H[[i]][,,3])), "HSV") H01[[i]]=H[[i]]@coords[,1][m01&(H[[i]]@coords[,2]>0.2)&(H[[i]]@coords[,3]>0.2)] H02[[i]]=H[[i]]@coords[,1][m02&(H[[i]]@coords[,2]>0.2)&(H[[i]]@coords[,3]>0.2)] H03[[i]]=H[[i]]@coords[,1][m03&(H[[i]]@coords[,2]>0.2)&(H[[i]]@coords[,3]>0.2)] H04[[i]]=H[[i]]@coords[,1][m04&(H[[i]]@coords[,2]>0.2)&(H[[i]]@coords[,3]>0.2)] H05[[i]]=H[[i]]@coords[,1][m05&(H[[i]]@coords[,2]>0.2)&(H[[i]]@coords[,3]>0.2)] H06[[i]]=H[[i]]@coords[,1][m06&(H[[i]]@coords[,2]>0.2)&(H[[i]]@coords[,3]>0.2)] H07[[i]]=H[[i]]@coords[,1][m07&(H[[i]]@coords[,2]>0.2)&(H[[i]]@coords[,3]>0.2)] H08[[i]]=H[[i]]@coords[,1][m08&(H[[i]]@coords[,2]>0.2)&(H[[i]]@coords[,3]>0.2)] H09[[i]]=H[[i]]@coords[,1][m09&(H[[i]]@coords[,2]>0.2)&(H[[i]]@coords[,3]>0.2)] H10[[i]]=H[[i]]@coords[,1][m10&(H[[i]]@coords[,2]>0.2)&(H[[i]]@coords[,3]>0.2)] H11[[i]]=H[[i]]@coords[,1][m11&(H[[i]]@coords[,2]>0.2)&(H[[i]]@coords[,3]>0.2)] H12[[i]]=H[[i]]@coords[,1][m12&(H[[i]]@coords[,2]>0.2)&(H[[i]]@coords[,3]>0.2)] H13[[i]]=H[[i]]@coords[,1][m13&(H[[i]]@coords[,2]>0.2)&(H[[i]]@coords[,3]>0.2)] H14[[i]]=H[[i]]@coords[,1][m14&(H[[i]]@coords[,2]>0.2)&(H[[i]]@coords[,3]>0.2)] H15[[i]]=H[[i]]@coords[,1][m15&(H[[i]]@coords[,2]>0.2)&(H[[i]]@coords[,3]>0.2)] H16[[i]]=H[[i]]@coords[,1][m16&(H[[i]]@coords[,2]>0.2)&(H[[i]]@coords[,3]>0.2)] if(length(H01[[i]])<px.sample){H01[[i]]<-NA} if(length(H01[[i]])>=px.sample){H01[[i]]<-sample(H01[[i]], size=px.sample, replace = FALSE)} if(length(H02[[i]])<px.sample){H02[[i]]<-NA} if(length(H02[[i]])>=px.sample){H02[[i]]<-sample(H02[[i]], size=px.sample, replace = FALSE)} if(length(H03[[i]])<px.sample){H03[[i]]<-NA} if(length(H03[[i]])>=px.sample){H03[[i]]<-sample(H03[[i]], size=px.sample, replace = FALSE)} if(length(H04[[i]])<px.sample){H04[[i]]<-NA} if(length(H04[[i]])>=px.sample){H04[[i]]<-sample(H04[[i]], size=px.sample, replace = FALSE)} if(length(H05[[i]])<px.sample){H05[[i]]<-NA} if(length(H05[[i]])>=px.sample){H05[[i]]<-sample(H05[[i]], size=px.sample, replace = FALSE)} if(length(H06[[i]])<px.sample){H06[[i]]<-NA} if(length(H06[[i]])>=px.sample){H06[[i]]<-sample(H06[[i]], size=px.sample, replace = FALSE)} if(length(H07[[i]])<px.sample){H07[[i]]<-NA} if(length(H07[[i]])>=px.sample){H07[[i]]<-sample(H07[[i]], size=px.sample, replace = FALSE)} if(length(H08[[i]])<px.sample){H08[[i]]<-NA} if(length(H08[[i]])>=px.sample){H08[[i]]<-sample(H08[[i]], size=px.sample, replace = FALSE)} if(length(H09[[i]])<px.sample){H09[[i]]<-NA} if(length(H09[[i]])>=px.sample){H09[[i]]<-sample(H09[[i]], size=px.sample, replace = FALSE)} if(length(H10[[i]])<px.sample){H10[[i]]<-NA} if(length(H10[[i]])>=px.sample){H10[[i]]<-sample(H10[[i]], size=px.sample, replace = FALSE)} if(length(H11[[i]])<px.sample){H11[[i]]<-NA} if(length(H11[[i]])>=px.sample){H11[[i]]<-sample(H11[[i]], size=px.sample, replace = FALSE)} if(length(H12[[i]])<px.sample){H12[[i]]<-NA} if(length(H12[[i]])>=px.sample){H12[[i]]<-sample(H12[[i]], size=px.sample, replace = FALSE)} if(length(H13[[i]])<px.sample){H13[[i]]<-NA} if(length(H13[[i]])>=px.sample){H13[[i]]<-sample(H13[[i]], size=px.sample, replace = FALSE)} if(length(H14[[i]])<px.sample){H14[[i]]<-NA} if(length(H14[[i]])>=px.sample){H14[[i]]<-sample(H14[[i]], size=px.sample, replace = FALSE)} if(length(H15[[i]])<px.sample){H15[[i]]<-NA} if(length(H15[[i]])>=px.sample){H15[[i]]<-sample(H15[[i]], size=px.sample, replace = FALSE)} if(length(H16[[i]])<px.sample){H16[[i]]<-NA} if(length(H16[[i]])>=px.sample){H16[[i]]<-sample(H16[[i]], size=px.sample, replace = FALSE)} } all.H01[[j]]=unlist(H01); all.H02[[j]]=unlist(H02); all.H03[[j]]=unlist(H03); all.H04[[j]]=unlist(H04); all.H05[[j]]=unlist(H05); all.H06[[j]]=unlist(H06); all.H07[[j]]=unlist(H07); all.H08[[j]]=unlist(H08); all.H09[[j]]=unlist(H09); all.H10[[j]]=unlist(H10); all.H11[[j]]=unlist(H11); all.H12[[j]]=unlist(H12); all.H13[[j]]=unlist(H13); all.H14[[j]]=unlist(H14); all.H15[[j]]=unlist(H15); all.H16[[j]]=unlist(H16); cat("processed year", sort.years[j], "\n") } allM=list(h01=all.H01, h02=all.H02, h03=all.H03, h04=all.H04, h05=all.H05, h06=all.H06, h07=all.H07, h08=all.H08, h09=all.H09, h10=all.H10, h11=all.H11, h12=all.H12, h13=all.H13, h14=all.H14, h15=all.H15, h16=all.H16) nr.images=NULL for (squares in 1:16){ all.H=allM[[squares]] all.H=tail(all.H, 201) na.H=sapply(all.H, function(x) sum(is.na(x))) sum(na.H) sum(excl.dim1) sum(excl.dim2) all.H=lapply(all.H, na.exclude) npics=sapply(all.H, length)/px.sample ### PLOT res_out=1024; w_out=8; h_out=8; psize_out=12; png(filename=paste("C:/Art/bbc_spatial/bbc_1800_2000_square_m",sprintf("%02d", squares),".png"), units="in", width=w_out, height=h_out, pointsize=psize_out, res=res_out, bg="transparent") par("mar" = c(0, 0, 0, 0)) par("mgp" = c(0,0,0)) plot(1:(length(1800:2000)+1), type="n", bty="n", xaxt="n", yaxt="n", xaxs="i", yaxs="i", xlab="", ylab="") for (k in 1:201) { if(length(all.H[[k]])!=0){ tempc=sort(all.H[[k]]) tempc=c(tempc[tempc>280],tempc[tempc<=280]) gradient.rect(k,1,k+1,length(1800:2000)+1,col=hex(HSV(tempc,0.8,0.8)),gradient="y",border=NA) } } dev.off() nr.images=append(nr.images, sum(sapply(all.H, length)/px.sample)) }
And this is the result:
As you can see from the result, the frames that were included in some pictures do not seem to affect the overall conclusion. The distribution of colors is roughly the same in different parts of the paintings. More blue in the upper parts, which makes sense if there is a few landscapes in there with blue skies. The great number of portraits among the older paintings might be one reason why the orange were more common in older paintings.
Who needs to have their portrait painted when you can just take a selfie?





