Strawberry leaf images and Red Green Blue indices derived using OpenCV image analysis

All season we have been monitoring the health of our Strawberry Greenhouse crop. In addition to visual inspection with a loupe, digital leaf imaging has been a useful way to follow the development of the plants. Now that autumn has arrived, the leaves are showing the usual spectacular colour changes.

Chlorophyll green quickly gives way to yellowing and eventual reddening of the leaves. The yellow pigments are present in the leaves all season but are masked by chlorophyll; as falling light levels slow chlorophyll production, the yellows become visible. Red pigments are produced in increasing amounts as the sugar concentration in the leaves rises. Leaf yellows are due to carotenoid pigments and leaf reds to anthocyanin pigments.

Chlorophyll molecules absorb both red and blue light, leaving green light to be reflected by plant leaves, making them appear green. Carotenoid molecules, on the other hand, absorb light at the blue end of the spectrum, so leaves reflect and scatter yellow, green and red light. Anthocyanin molecules absorb blue and green light, so leaves reflect and scatter deeper red light.

leaf pigments spectra
Absorption spectra of chlorophyll a (blue), chlorophyll b (green), carotenoid type (orange), anthocyanin type (red) (Courtesy SPIE journals Creative Commons and Universidad de Guadalajara, Mexico)

Spectrometers and hyperspectral imaging devices can measure the colours of light reflected by leaves with high precision (1 nm or better). As the figure above from researchers at the Universidad de Guadalajara in Mexico shows, plant pigments absorb light in wavelength bands much wider than 1 nm. Modern smartphones typically have cameras with many megapixels, giving images of astonishing resolution. Their detector chips also have RGB elements capable of measuring red, green and blue light. The red, green and blue pixels do not have high spectral resolution; their sensitivity curves are rather broad. However, those sensitivity curves are quite comparable in width to the absorption curves of the three main leaf pigments. The figure below shows typical spectral sensitivity curves for an Android phone.


Smartphone camera RGB sensitivities (Courtesy of Optical Society of America Open Access agreement)

Red camera pixels preferentially detect light attenuated by chlorophyll, and green pixels light attenuated by anthocyanins. Blue pixels are less discriminating: they detect light attenuated by chlorophyll, carotenoids and, to a lesser extent, anthocyanins.

It should therefore be possible to construct an index, or set of indices, which relates at least qualitatively to the amounts of these pigments in strawberry plant leaves. However, this is not straightforward: the red channel is the only colour channel that measures the influence of just one pigment, chlorophyll. Red channel values cannot be used directly because lighting intensity varies during the course of a day, and the sky and sun colours also change subtly. Changes in illumination (colour and intensity) can be normalised to some extent using a reference background for leaf imaging. A uniform black background provides a useful contrast to the leaves themselves, as well as a reference for the red, green and blue intensities measured by the smartphone camera.

The figure at the top of this blog shows a spreadsheet with three normalised colour ratios:

R/(R+G+B); G/(R+G+B); B/(R+G+B)

Each index has a good relation to the redness/greenness/blueness of strawberry leaves growing in our Strawberry Greenhouse.

A short Python script was written to mask just the leaf pixels and sum the intensities of the red, green and blue pixel values for each leaf. Red, green and blue ratios were then calculated to give numbers independent of illumination intensity. One of the great things about using Python is the ease of programming and the large number of Python library modules freely available. Image analysis was carried out on JPG files using the OpenCV open source computer vision library.
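A minimal sketch of that calculation (not the original script; the simple grey-level mask and the threshold value of 40 are illustrative assumptions):

```python
import numpy as np

def leaf_colour_ratios(img_bgr, threshold=40):
    """Normalised R, G, B ratios for the leaf pixels of a BGR image.

    Assumes leaves photographed against a uniform black background,
    so any pixel whose mean grey level exceeds `threshold` is leaf.
    """
    img = np.asarray(img_bgr, dtype=np.float64)
    grey = img.mean(axis=2)                   # simple luminance estimate
    leaf = img[grey > threshold]              # N x 3 array of B, G, R values
    b, g, r = leaf[:, 0].sum(), leaf[:, 1].sum(), leaf[:, 2].sum()
    total = r + g + b
    return r / total, g / total, b / total
```

With OpenCV, `cv2.imread("leaf.jpg")` returns exactly this kind of BGR array, so the three ratios for a photo would be `leaf_colour_ratios(cv2.imread("leaf.jpg"))`.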

Further work is ongoing to develop image models that reflect the presence of chlorophyll, carotene and anthocyanin pigments more specifically. Ultimately it may be possible to relate changes in pigmentation not only to senescence but also to nutrient levels and disease susceptibility.


Leaf spectroscopy research from the Universidad de Guadalajara in Mexico has been reported here.

OpenCV project library and documentation can be found here.

Read more about our Strawberry Greenhouse project here.

french bean leaves
Mineral deficiency in bean leaves classified by multispectral imaging
French bean Phaseolus vulgaris (Image courtesy of WikiMedia Creative Commons)

Plants tell us when they are lacking vital nutrients but we can’t always hear what they are saying. Nitrogen, phosphorus and potassium (N, P, K) are well known macronutrients and the appearance of plants lacking any one of them is also well known. Plants lacking nitrogen have small leaves and stunted growth, those lacking phosphorus have poorly developed root systems and plants deficient in potassium fail to flower well.

Micronutrients, including magnesium, boron and iron (Mg, B and Fe), also affect the way plants grow and function but act together with N, P and K in ways that can be complicated. This means that chemical analysis is required to identify which element is lacking and how much should be added to a growing crop, or to a field before sowing seed. Quantifying the elements present in plant material is straightforward in the lab but time-consuming. If analysis is prompted by how the plant looks to the eye, it is also too late to correct a nutrient problem. To be useful, micronutrient analysis must be carried out at an early stage in plant development.

Researchers from the Department of Plant Development at the University of Zagreb recently reported that multispectral imaging of plant leaves can be a quick, early and non-destructive way to classify nutrient deficiency in young bean plants. Writing in the latest edition of the journal Frontiers in Plant Science, Boris Lazarevic and team described how multispectral imaging of french bean leaves can be used to distinguish normal healthy plants from those lacking nitrogen, phosphorus, potassium, magnesium or iron. Just three days after introducing nutrient-deficient conditions, multispectral imaging correctly classified 92% of bean plants suffering from deficiency. After twelve days, 100% of bean plants could be correctly classified as healthy or deficient in N, P, K, Mg or Fe.

How did they achieve this?

PlantExplorer multispectral leaf imager (image courtesy of PhenoVation)

The team from Zagreb used an instrument similar to the PlantExplorer (shown above) to image juvenile leaves of french bean plants in containerised trays. Each tray contained plants growing in hydroponic media and was imaged at 3, 6, 9 and 12 days after the introduction of test solutions. A control solution with a cocktail of standard macro- and micronutrients was the basis for the other nutrient-deficient test solutions. Individual trays were grown with solutions lacking the N, P, K, Mg and Fe components.

A phenotype, or set of physical characteristics, was used to identify potential changes in the leaves resulting from growth in solutions deficient in each mineral. These spectral parameters were either the reflectance of the leaves at different wavelengths (640, 550, 475, 510-590, 730, 769, 710 nm) or parameters derived from the images (e.g. green leaf index GLI, chlorophyll index CHI, anthocyanin index ARI, hue, saturation and intensity).

So far so good, but how to extract useful information from the image data, and how to evaluate it? Lazarevic and team chose a statistical method known as linear discriminant analysis (LDA). LDA is a powerful way to find the parameters, or combinations of parameters, that group together data from one set of plants while distinguishing that set from the others. In the case of the mineral deficiency study, the sets represented the plants in each tray.
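To see the idea in miniature, here is a toy LDA classification with scikit-learn. The numbers are invented stand-ins for the Zagreb data: rows are leaf images, columns are spectral parameters (e.g. reflectance bands or derived indices such as GLI and CHI), and labels mark the nutrient treatment of each tray.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Invented two-parameter data: control trays cluster in one region,
# nitrogen-deficient trays in another.
rng = np.random.default_rng(0)
control = rng.normal(loc=[0.45, 0.30], scale=0.02, size=(30, 2))
n_deficit = rng.normal(loc=[0.35, 0.40], scale=0.02, size=(30, 2))
X = np.vstack([control, n_deficit])
y = np.array(["control"] * 30 + ["N-deficient"] * 30)

# LDA finds the linear combination of parameters that best separates the sets
lda = LinearDiscriminantAnalysis()
lda.fit(X, y)
accuracy = lda.score(X, y)  # fraction of leaves classified correctly
```

With well-separated groups like these, the discriminant recovers the tray labels almost perfectly; the real study's job was to find which measured parameters give this kind of separation.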

Decision tree for classification of plant leaf images based on multispectral parameters (Image courtesy of Frontiers in Plant Science Journal and the University of Zagreb)

The figure above shows how multispectral parameters were used to classify plant leaf images. Each three-day timepoint is denoted MT1, MT2, MT3 or MT4. It is interesting to note that different discrimination criteria were used for different measurement dates. Different colours represent the different missing minerals. After 12 days, using LDA, it was possible to correctly classify virtually all the plant images into control and N, P, K, Mg and Fe deficient groups. After just 3 days (MT1), most of the plant images were correctly classified, but not with the same criteria as those used on other dates.

In addition to multispectral measurements on the plant trays, the Zagreb group also evaluated chlorophyll fluorescence and morphological measurements as potential techniques for mineral deficiency classification. Chlorophyll fluorescence is of interest because it can reveal levels of plant activity or function. Morphological measurements, such as plant height, have long been used by farmers to check the progress of crops.

However, neither method was as successful as multispectral imaging in classifying mineral deficiency. The paper from the Department of Plant Development at the University of Zagreb reveals that multispectral imaging can be used to classify different mineral deficiencies in plants. Consequences of mineral deficiency can be detected after only three days, but the fact that each measurement date requires a different set of classification criteria suggests that the methods tested are not yet robust enough to use as generic measures of mineral deficiency.

It will be fascinating to see how far the multispectral imaging methods can be developed into routine diagnostic techniques for farmers.

The full Frontiers in Plant Science article can be found here.

Information about the BACO multispectral imaging instrument available from Corbeau Innovation can be found here.

espresso coffee
Coffee leaf miner infection located by multispectral imaging

“There’s an awful lot of coffee in Brazil!”, according to the old song recorded by Frank Sinatra. In 2021 almost 70 million 60 kg bags of coffee were produced in Brazil, roughly one third of global production, making Brazil the largest coffee producer in the world. Yield varies from year to year depending mainly on the weather but also on disease risks.

Coffee leaf miner infection is a major cause of poor cropping and ultimately plant death. These little critters are the larvae of a tiny moth that lays its eggs on the surface of coffee plant leaves. They munch their way into leaves leaving black holes filled with (ahem) poop, which create the impression of tiny mine shafts in the leaf.

Researchers at the Federal University of Lavras in Brazil have demonstrated a new way to spot the effects of disease in coffee plants using a drone equipped with a multispectral imaging camera.

coffee plant
Healthy coffee plant (courtesy WikiMedia Commons)

They flew the drone over a coffee plantation in the Minas Gerais region of Brazil. At a height of 3 m the multispectral camera could record images of individual coffee leaves on different Coffea arabica L. plants. Four wavebands were used for the imaging: 530-570 nm (green); 640-680 nm (red); 730-740 nm (red edge); and 770-810 nm (near infrared). Similar images of the leaves of healthy coffee plants of the same species were recorded manually in a greenhouse for comparison.

drone and multispectral camera
(a) Quadcopter drone and (b) multispectral camera used in plantation (courtesy AgriEngineering Journal and University of Lavras)

Images from any single camera waveband are subject to variations in sunlight and shadowing, so researchers have developed a large number of so-called Vegetation Indexes (VIs) to allow comparison of measurements on different days and at different locations. The normal method is to take a ratio of a camera waveband that changes a lot against one that typically changes very little. Possibly the most popular vegetation index is NDVI, the Normalised Difference Vegetation Index:

NDVI = (NearInfrared – Red) / (NearInfrared + Red)

Plants absorb red and blue light strongly and reflect green and near infrared, so healthy leaves have relatively high NDVI values (0.8-1.0). Broadly, the higher the NDVI value for a plant leaf, the greener and healthier it is.

The team at Lavras wanted to discover which VI was the most effective at distinguishing healthy coffee leaves from leaves infected by coffee leaf miner. They reasoned that because infected leaves are generally darker due to the poop tracks, healthy leaves should typically have higher VI values.

coffee plant vegetation index images
Coffee plant vegetation index images: (A, C) Healthy in greenhouse; (B, D, E) infected in plantation (courtesy AgriEngineering Journal and University of Lavras)

Comparison of the average difference in VI values and their distribution across many leaves showed that the GRNDVI index gave the best differentiation between healthy and diseased coffee leaves. Accounting for Green and Red variations between healthy and diseased leaves gave ratios of 0.32 and 0.06 for healthy and leaf miner infected leaves respectively. These values are lower than the NDVI values measured but give a much larger difference, allowing better differentiation.
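These indices are simple arithmetic on the camera bands. A short sketch, assuming the standard published definitions of NDVI and of GRNDVI (the Green-Red NDVI), with illustrative reflectance values rather than the study's data:

```python
import numpy as np

def ndvi(nir, red):
    """Normalised Difference Vegetation Index."""
    return (nir - red) / (nir + red)

def grndvi(nir, green, red):
    """Green-Red NDVI: folds the green band into the NDVI ratio."""
    return (nir - (green + red)) / (nir + (green + red))

# Toy reflectance values (fractions of incident light) for a healthy leaf:
# strong near infrared, modest green, little red.
nir, green, red = 0.80, 0.15, 0.05
healthy_ndvi = ndvi(nir, red)            # high, close to 1
healthy_grndvi = grndvi(nir, green, red)  # lower than NDVI, as in the study
```

Both functions also accept NumPy arrays, so a whole multispectral image can be converted to an index map in one call, which is how index images like those in the figure above are produced.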

Coffee farmers could benefit from quicker identification and location of leaf miner disease in their plantations if this research can be transferred to a commercial product. Coffee drinkers the world over could benefit from more sustainable farming and more stable pricing for their favourite brew.

Read the full journal paper here.

Find out about Vegetation Indexes here.

Find out more about coffee production in Brazil here.

cannabis leaves
Raman spectroscopy sorts male hemp plants from females
Cannabis sativa (courtesy Wikipedia)

It turns out that growers of medicinal cannabis and those catering for a more recreational market love females but hate males. Gender bias is certainly in the news today but horticulturalists have long known that female hemp plants have higher levels of pharmacologically active compounds than male plants. The difference in yield between the sexes is so large that it can make cultivation uneconomic.

There are three main pharmacologically active chemical compounds in hemp (Cannabis sativa) plants. Cannabidiol (CBD) and cannabigerol (CBG) are the main cannabis compounds of pharmaceutical interest, while hemp plants which contain more than 0.3% tetrahydrocannabinol (THC) are frequently classified as marijuana. Collectively, this family of compounds is known as cannabinoids.

Male cannabis plants produce male flowers and female plants produce female flowers with higher levels of cannabinoids. Flowers are relatively easy to tell apart but the plants need to be identified as male or female before flowering so that males can be removed and preferably not cultivated at all. It is therefore important to pharmacology research and therapeutics development that plants are identified before flowering. Genetic tests exist but cannot give an immediate result in the field.

Researchers at Texas A&M University have developed an immediate, non-invasive method of determining the sex of hemp plants with a success rate of up to 94%. Professor Kurouski’s team used a handheld Raman spectrometer to measure not the cannabinoids themselves but the pigments that give plants their colour. A commercially available Raman spectrometer (shown below) with a 830 nm wavelength near infrared laser was used to analyse the pigments in the plant leaves.

Agilent Resolve handheld Raman spectrometer (courtesy of Agilent Technologies Inc.)

Raman spectroscopy is a powerful chemical analysis technique that records a ‘chemical fingerprint’ of the vibrations of bonds in molecules. When a laser is focused onto a substance, almost all of the laser light is reflected or scattered with exactly the same colour as the laser. A tiny fraction, less than a millionth of the incident light, interacts with vibrations of molecules in the substance. As a result this light loses energy and becomes red-shifted. A Raman spectrometer spreads out the red-shifted light to reveal an array of slightly different coloured peaks. Each peak corresponds to a specific molecular vibration, and together they form the Raman spectrum or ‘fingerprint’ of the substance.
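To put numbers on the red shift: Raman shifts are quoted in wavenumbers (cm⁻¹), and a short calculation (a worked example, not from the paper) converts a shift back to the wavelength at which the scattered light appears.

```python
def scattered_wavelength_nm(shift_cm1, laser_nm=830.0):
    """Wavelength of Raman-scattered light for a given shift in cm^-1.

    The default 830 nm matches the near infrared laser used in the study.
    """
    laser_cm1 = 1e7 / laser_nm             # convert nm to wavenumbers
    return 1e7 / (laser_cm1 - shift_cm1)   # red-shifted: longer wavelength

# The 1440 cm^-1 peak used to normalise the hemp spectra appears at
# roughly 943 nm, red-shifted by over 100 nm from the 830 nm laser line.
peak_nm = scattered_wavelength_nm(1440.0)
```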

Averaged Raman spectra of female (red) and male (green) hemp leaves (*normalised to the 1440 cm-1 peak) (courtesy Texas A&M University and Springer-Verlag)

Averaged Raman spectra from male and female plant leaves showed small but consistent differences. Female plant leaves gave higher intensities of peaks associated with plant pigments and other biomolecules. Plant pigments typically include chlorophyll, lutein, lycopene, beta carotene, cryptoxanthin and zeaxanthin.

In order to investigate which plant pigment could be contributing to the differences between the male and female Raman spectra, the Texas A&M team took leaf samples from the same plants and analysed them by high pressure liquid chromatography (HPLC). HPLC comparison showed that female plant leaves generally had higher levels of pigments and had clearly higher levels of the pigment lutein. Differences in the Raman spectra of male and female hemp leaves were therefore associated with the plant carotenoid lutein.

To make a predictive model, they used partial least squares discriminant analysis (PLS-DA) on the second derivative of the Raman spectra, keeping two components. Using these two components, around 90% of young plant leaves could be correctly identified as male or female, and 94% of mature plant leaves could be correctly classified.

Professor Dmitry Kurouski and his group have therefore developed a new method of quickly determining the sex of young cannabis hemp plants in the field or greenhouse. The Raman method has the potential to improve the efficiency of medicinal cannabis production and aid research into new pharmacologically active substances.

You can find the full paper recently published by Springer-Verlag here.

News from Dmitry’s group at Texas A&M can be found on his website.

Related research on Raman analysis of hemp leaves can be downloaded from this Open Access Source.

Related research on using Raman analysis to measure plant disease can be found here.

If you want to try your Python skills at PLS-DA there is a useful practical introduction here.

Precision viticulture coming to fruition
BACO multispectral leaf imager on a grape crate

Mists, check. Mellow fruitfulness, check. Maturing sun – eventually, check! In the UK the 2021 growing season has been good but not exceptional. A late spring followed by a warm June and July gave way to a disappointing August with plenty of warm damp days to encourage the development of mildew. It has been fascinating to follow the 2021 Llanerch Precision Viticulture Pilot Study from bud-burst to final harvest. As followers of the study will know, we started in April with the installation of a SensIT microclimate weather station and then made the first BACO multispectral leaf measurements at the end of May. BACO leaf measurements continued approximately every two weeks until harvest was completed on 22nd October.

Season of mists and mellow fruitfulness,

Close bosom-friend of the maturing sun;

Conspiring with him how to load and bless

With fruit the vines that round the thatch-eves run;

To Autumn by John Keats (1795-1821)

SensIT and BACO are two of the key components of the Xloora precision viticulture system. SensIT is an IoT (Internet of Things) weather station measuring %RH, temperature, wind speed and direction, leaf wetness, rainfall, sunshine and air pressure. Hourly readings are uploaded to the Xloora cloud platform and used to predict the likelihood of disease development. BACO uses seven different wavelengths of light from deep blue to near infrared to take pictures of vine leaves. Ratios of these images can give measures of leaf pigments such as chlorophyll, carotene and anthocyanin. They can also give early indication of disease development, forewarning the farmer of problems before they are obvious to the human eye.

  • New SensIT in vineyard
  • BACO multispectral leaf imager
  • reichensteiner multispectral leaf measurements

BACO also has GPS so that the locations of vine readings can be associated with blocks of different grape varieties. The web browser user interface shows individual measurements, alerts and reports over longer periods of time.

The grape harvest at Llanerch this year has been tremendous, a great crop achieved with few interventions to control disease. From reichensteiner to solaris, to seyval blanc, phoenix and even the old triomphe d’Alsace, vines have been heavy with bunches of grapes. A great collection of outputs from the vineyard. What about the Pilot Study?

Over the season there were more than 3,500 IoT data uploads and more than 28,000 microclimate records generated. Almost 300 multispectral sets of images were generated by BACO making more than 2,000 leaf images in total.

Detailed analysis of the data, reports and alerts has just started but some interesting results are already clear. At harvest just one block, the reichensteiner, had obvious signs of developing mildew on some of the vines. A time series of representative leaf images from the rows in question shows the progressive change:

NDVI leaf images of susceptible reichensteiner vines, acquired with BACO from June to October

It appears by visual inspection that the leaf ratio (normalised difference vegetation index, NDVI) images become much less uniform as the disease burden grows through the year. Finally mildew is evident to the human eye in late October.

SensIT microclimate data from the vineyard at Llanerch was compared with a feed from a commercial weather station in the area. While the temperature and relative humidity values were generally similar, there were notable deviations.

Comparison of the temperature and relative humidity data from the same period in August reveals that the microclimate of the vineyard follows the general trend of the meteo reports, but its changes are much more pronounced. Presumably this is because the vineyard at Llanerch has an open aspect which warms and cools more quickly than the location of the meteo report sensors (which was not known precisely). Accurate data from the vineyard is likely to be crucial to the success of predicting the start and spread of diseases like downy and powdery mildew.

A more detailed analysis of the Xloora precision viticulture platform results will be carried out over the coming weeks in preparation for extending the trial next year.

Visit us at the Vineyard Show 2021 at the Kent Showground on 24th November, booth S8

Read more about SensIT, BACO and Xloora here

Cassava virus infection detected with handheld multispectral imager
Healthy cassava leaves (image courtesy of Pixabay)

One of the most important sources of carbohydrate energy in equatorial countries comes from the cassava plant. It looks like a trendy house plant but the roots are large tubers that provide valuable nutrition. Tubers can be boiled and mashed or dried, ground and turned into flour. Cassava is a robust crop but suffers from two virus diseases that can ruin an entire crop with very little warning until the tubers are dug up at harvest. Rotten tubers cannot be used for food and signs of viral infection are not obvious on the leaves or stems until it is too late to plant a replacement crop. Propagation of cuttings for the following year’s crop is also affected by virus, so potentially two years’ worth of food could be lost in a single infection.

Cassava mosaic virus (CMV) and cassava brown streak disease (CBSD) are the two main viral diseases responsible for crop loss. There are now cassava varieties that are resistant to CMV but CBSD is still problematic. There are various biochemical diagnostic tests for the diseases but the most reliable, a PCR test, is expensive, invasive and requires a relatively high level of viral load. Prompted by the need for a better, quicker test, a team of researchers from the University of Manchester, North Carolina State University, Rutgers University and the International Institute of Tropical Agriculture have developed a handheld multispectral leaf imager that detects the presence of CBSD before signs are obvious to the human eye.

Cassava tubers ready for processing (image courtesy of PixaBay)

In a pre-print paper last month, Hujun Yin and colleagues reported how a compact 14-wavelength multispectral leaf imaging device utilising machine learning successfully classified diseased plants and control plants. Photos of the device are shown below.

Multispectral leaf imager (a) and sample chamber window with grid to hold leaf flat and LED ring for illumination (b) (image courtesy of Creative Commons, Research Square and the authors)

The Manchester study comprised three trials, each containing cassava plants naturally immune to CMV to minimise the chance of random viral infection from another source. All three trials had three treatment groups: controls; CBSD inoculated; and E. coli inoculated. The last group was included to control for the inoculation method itself, since E. coli should have no effect on the health of the cassava plants. Plants were measured at 7, 14, 21, 28, 52, 59 and 88 days post inoculation (dpi). Leaf images were recorded using 14 different LED light wavelengths (395, 415, 470, 528, 532, 550, 570, 585, 590, 610, 625, 640, 660, 700 and 880 nm). At each time point plant leaves were also given scores from 1 to 4 based on how they appeared to the eye. Typical leaves are shown below.

Cassava leaf scores (1-4) at days post inoculation

In trial 2, PCR tests were performed on leaves at each time point and visual scores were documented. The visual scores showed a progression with time for the inoculated leaves as expected (see below).

Cassava leaf scores as trial 2 progressed

To verify that the virus was indeed successfully inoculated into the plants, PCR tests were run: they confirmed the virus was present in some plant leaves after day 52, with the highest levels at the end of the time course.

Analysis of the leaf image spectral hypercubes (14 wavelengths x 12 random groups of leaf pixels) produced metadata consisting of six vegetation indices (VIs), average spectral intensities (ASIs) and texture (a measure of the variation of the intensities from pixel group to pixel group on a leaf). Interestingly, even some simple VIs were capable of distinguishing some diseased plants from healthy ones (more than 60% successful classification at day 52, comparable to the PCR testing).

The metadata were used to create a classifier based on measurements taken from plants with positive PCR test results. Rather than use a convolutional neural network (CNN) to classify the images or metadata, the team used a Support Vector Machine (SVM). SVMs have the advantage that they typically do not require high computing power, and they are intuitively quantitative: they use simple regression to find the best dividing lines between image categories. The SVM produced a marked improvement in classification, with better than 80% success, as a result of using positional information in addition to the VIs. An introductory reference is given at the end of this post.
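To see the kind of dividing line an SVM finds, here is a toy sketch with invented numbers (not the study's data): each row pairs a vegetation index with a texture measure for one group of leaf pixels, and a linear SVM separates healthy from infected groups.

```python
import numpy as np
from sklearn.svm import SVC

# Invented metadata: healthy leaves have high VI and low texture variation,
# infected leaves the reverse.
rng = np.random.default_rng(2)
healthy = rng.normal(loc=[0.80, 0.10], scale=0.05, size=(50, 2))
infected = rng.normal(loc=[0.60, 0.30], scale=0.05, size=(50, 2))
X = np.vstack([healthy, infected])
y = np.array([0] * 50 + [1] * 50)   # 0 = healthy, 1 = infected

# A linear SVM fits the widest-margin dividing line between the classes
svm = SVC(kernel="linear")
svm.fit(X, y)
accuracy = svm.score(X, y)
```

With clearly separated clusters like these the dividing line is easy to find; the real challenge in the cassava work was choosing metadata (VIs, intensities, texture, position) that actually separate the classes.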

The group made one further improvement to their model: combining subsets of classifiers produced by the SVM. They called this approach Decision Fusion (DF) and Probabilistic Decision Fusion (PDF), which is basically saying that if one classifier doesn’t work well, combine it with another and see if the performance improves. With this, they achieved 80-90% success in classifying diseased and healthy plants at day 52.
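The fusion idea can be illustrated with a toy sketch (not the paper's actual algorithm): average the probability outputs of two classifiers, so that a confident classifier can compensate for an uncertain one.

```python
import numpy as np

def probabilistic_fusion(prob_a, prob_b):
    """Fuse two classifiers' P(diseased) estimates by simple averaging.

    A minimal stand-in for probabilistic decision fusion: where one
    classifier is unsure, the other can tip the balance.
    """
    return (np.asarray(prob_a) + np.asarray(prob_b)) / 2

# Classifier A is confident about plant 1 but unsure about plant 2;
# classifier B is the reverse.
p_a = [0.95, 0.55]
p_b = [0.60, 0.90]
fused = probabilistic_fusion(p_a, p_b)   # [0.775, 0.725]
labels = (fused > 0.5).astype(int)       # both plants flagged as diseased
```

Real fusion schemes weight the classifiers by their reliability rather than averaging equally, but the principle is the same: combined decisions can beat any single classifier.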

It will be very interesting to see multispectral imaging applied to more diseases and to the classification of differing diseases.

The pre-print paper can be found here, courtesy of Research Square and the authors Yao Peng, Mary Dallas, José T. Ascencio-Ibáñez, Steen Hoyer, James Legg, Linda Hanley-Bowdoin, Bruce Grieve, Hujun Yin.

A nice introduction to Support Vector Machines from MonkeyLearn can be found here.

Information on our own BACO multispectral leaf imager can be found here.

Counting grapes with machine learning
bunches of grapes
Bunches of grapes in a full canopy of leaves and stems

As the grapes swell in the late summer sun and rain, vignerons start thinking about the harvest. What is the yield going to be? How much sugar will there be in the grapes? And, finally, how many bottles of wine can be made?

It would be really useful to have a way to predict the yield of each vine, sector and vineyard. Reliability is important because securing the wrong amount of vineyard labour, vat capacity or bottling can lead to either higher costs or loss of production. Hand counting is reliable but extremely time-consuming if a good representative portion of the vineyard is counted by visual inspection. Wouldn’t it be great if you could automate the inspection process by mounting a camera on a drone or quad-bike and then use AI (artificial intelligence) to pick out the grape bunches from the images of leaves, stems and bunches as the camera races up and down the rows of vines?

Well, this has been done by a number of developers and Corbeau reviewed one clever solution recently. The big challenge is to overcome the confusion of bunches partially obscured by leaves and stems.

Machine learning (ML) examples often use images of salads, with bright red tomato slices, green lettuce and orange carrot sticks. These can be easily sorted or counted using colour filters. In the vineyard you encounter green leaves, grapes, stems and tendrils, each growing at a different angle and in a range of different shades of green. Counting grape bunches is a much harder problem than counting tomatoes.

Undaunted, Corbeau decided it was time to investigate Open Source AI as a way to count grape bunches in vine images. There are a wide range of Open Source tools available, some of which require programming expertise and some which do not. The main drawback of using Open Source tools is that the documentation is typically either very limited or written in a strange English dialect spoken only by Microsoft employees.

Example of Microsoft English dialect

Corbeau started with a set of 30 colour vine images taken with a smartphone. Each image was taken in full sun and contained random combinations of grapes, leaves, stems and tendrils. Images typically covered a vine area of around 1 square metre. The first task was to create sets of annotations to identify features of interest. This was done manually with the Microsoft API, drawing labelling boxes around grape bunches and stems in the images. An example annotation is shown below.

Bunch label (yellow) and stem labels (green)

A CNN (convolutional neural network) was chosen to create a machine learning model. Convolution is a useful way to emphasise shapes in images. It works by measuring how strongly each part of an image matches a particular pattern, for example parallel vertical lines or circles. After convolving a labelled reference box with a pattern, the simplified new image is assessed: is it more ‘like’ the pattern than something else? When many different patterns are convolved with an image, a particular set of patterns will tend to characterise one image more than another. Each of these pattern ‘like’-ness tests can become a decision point in an image classification, and in this sense the decision-point network is an intelligent classifier. The CNN requires ground truth, provided by the human-labelled examples of bunches and stems, and becomes ‘intelligent’ by learning from them.
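The convolution step can be sketched in a few lines of Python (a toy illustration, not the model used here): a small ‘vertical edge’ pattern is slid across an image, and the output is large wherever the image locally resembles the pattern.

```python
import numpy as np

def convolve2d(image, kernel):
    """Valid-mode 2D cross-correlation, the building block of a CNN layer."""
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            # Response = how strongly this patch matches the pattern
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A pattern detector for vertical edges: it responds strongly where the
# image changes from dark to bright left-to-right, and weakly elsewhere.
vertical_edge = np.array([[-1.0, 1.0],
                          [-1.0, 1.0]])

image = np.zeros((4, 6))
image[:, 3:] = 1.0   # bright region on the right half of the image

response = convolve2d(image, vertical_edge)
# The strongest responses line up with the dark-to-bright boundary.
```

A CNN stacks many such pattern detectors, learning the kernels themselves from the labelled training boxes instead of hand-picking them.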

The sets of convolutions which classify a bunch and a stem are different, and require verification and testing before they can be used as a machine learning classification model. New vine images were taken to verify the performance of the classification model. Bunch classification was more successful than stem classification because a much higher proportion of bunches than stems were annotated in the learning set of images. After a few hours of refining the classification models, the best model produced a 0.77 confidence level for bunch identification and a 0.12 confidence level for stem identification. When presented with similar images (size, resolution, range) the model should be capable of identifying grape bunches and counting them. The model was deployed as a TensorFlow classification task, ready for counting grape bunches in additional grapevine images.

To test how this ML approach could be used in the vineyard, a burst series of smartphone images were recorded as the phone was carried parallel to a test vine. Running the ML model produced a list of potential image boxes found by the classifier, shown below.

Smartphone burst image of vine with annotation labels

Two bunches were identified with 1.000 (right hand bunch) and 0.928 (top left bunch) confidence levels. The next image box (at the bottom left side of the picture above) was not a bunch and had a 0.019 confidence level.

These results are an encouraging start to the machine learning bunch counting project. If you would like to take part in the project, please get in touch with Pierre Graves by completing the Contact form!

Smarts or knowledge – which one wins at precision agriculture?
Wheat field in Hungary (courtesy Wikimedia Commons)

Imagine a competition to produce the highest yields of winter wheat between Sheldon Cooper and a winner of the Apprentice. Who would win? It’s tempting to choose the Big Bang brain-box but what if Lord Sugar’s apprentice had spent 10 years working on arable farms in the UK, Australia and the USA before joining the house of hopefuls?

Precision agriculture poses similar questions. Is it better to have the deepest understanding of plant biology, soil chemistry and meteorology or the widest? Is it better to have the most detailed mathematical model of plant growth or the most robust?

These questions got an interesting airing in a recent paper by scientists at CSIRO (Commonwealth Scientific and Industrial Research Organisation) published in the journal Field Crops Research. Andre Colaco and colleagues considered the question of how to optimise the harvest of winter wheat by supplying nitrogen. They compared the performance of detailed advisory models that used single field sensors with less detailed models that used multiple sensor inputs from the field. What does this mean? One advice system, for example, might be based on measuring nitrate concentrations in crop leaves and include a whole set of equations describing how nitrate ions move from fertiliser pellets on the soil surface into the roots, up the stems and into the leaves. Another might also use sensor measurements of temperature, humidity, rainfall, wind speed and hours of sunshine, but use only basic assumptions about the transport of macronutrients.

Which approach is more successful – the deep one or the wide one?

Wheat leaf (courtesy Wikimedia Commons)

Colaco and colleagues proposed on-farm experimentation and machine learning with multiple sensor inputs as a better way to apply artificial intelligence to crop management. They took 20 years of publicly available winter wheat data from Oklahoma State University (OSU) and used it to test different deep and wide approaches to advising how much nitrogen should be applied in mid-season to realise the potential yield of a crop of wheat. Four different approaches were tested, using half the historical test data as learning-sets and half as test-sets.


Their first approach was based on predicting a Yield Potential and then assessing the difference between the nitrogen content of that yield (i.e. kg of wheat per hectare) and the available nitrogen in the soil. The difference is the recommended nitrogen that must be applied to the field (in kg N per hectare). How were these two numbers calculated? Yield potential for wheat was measured by OSU using the so-called GreenSeeker sensor model. GreenSeeker is a handheld multispectral device that measures the NDVI (normalised differential vegetation index) of field crops. By comparing the NDVI response of field samples against a look-up table, an in-season estimated yield was obtained. Basically the greener the field test-strip, the bigger the expected yield. Farmers have been using simpler metrics like crop height to predict yield in a similar fashion for some years.

On-farm experimentation data showing the Optimal Nitrogen Rate and the Optimal Nitrogen Recommended as the difference between the predicted yield of the current crop and the optimal yield (courtesy Field Crops Research Journal)
GreenSeeker multispectral sensor (courtesy Trimble Agriculture)

Nitrogen demand for such a yield was calculated using standard assumptions: the nitrogen content of wheat is typically 2.4% by weight and the efficiency of nitrogen uptake is typically 44% of that applied to a field. The nitrogen recommendation for Approach 1 was therefore equal to:

[(expected nitrogen in predicted yield) – (available nitrogen in the field)] ÷ uptake efficiency
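As a minimal Python sketch of that mass-balance: the 2.4% and 44% figures come from the assumptions above, while the predicted yield and soil nitrogen inputs are invented for illustration. The deficit is divided by the uptake efficiency, since only around 44% of applied nitrogen actually reaches the crop, so enough must be spread to cover the shortfall.

```python
# Sketch of the Approach 1 nitrogen mass-balance.
# The 2.4% and 44% figures are standard assumptions; the
# yield and soil nitrogen inputs below are illustrative only.

N_CONTENT = 0.024         # wheat is ~2.4% nitrogen by weight
UPTAKE_EFFICIENCY = 0.44  # only ~44% of applied N reaches the crop

def nitrogen_recommendation(predicted_yield_kg_ha, available_n_kg_ha):
    """Recommended mid-season N application (kg N per hectare).
    The deficit is scaled up by the uptake efficiency so that
    enough is applied to cover the shortfall."""
    expected_n = predicted_yield_kg_ha * N_CONTENT
    return (expected_n - available_n_kg_ha) / UPTAKE_EFFICIENCY

# e.g. a 4,000 kg/ha predicted yield with 40 kg N/ha already available
print(round(nitrogen_recommendation(4000, 40), 1))  # → 127.3
```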


The second approach was probably more appealing to chemists, using assumptions about the nitrogen response rate (i.e. the concentration of nitrogen multiplied by some rate constant driving the complex growth reaction) rather than a nitrogen mass-balance. With Approach 2, the NDVI values of wheat in different test strips were used directly as parameters for plant growth rate. The in-field experiments required mid-season measurements of wheat test strips given different levels of nitrogen at the start of the season, and aimed to find the plateau NDVI value corresponding to the maximum level of nitrogen the plants could take up given the soil and climate conditions. NDVI measurements in this case were made with a Crop Circle sensor and converted directly to a recommended nitrogen application rate using a look-up table.

Crop Circle multispectral sensor (courtesy of Holland Scientific)
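The look-up step of Approach 2 can be sketched with a hypothetical table; the NDVI values and nitrogen rates below are invented for illustration, whereas the real table would come from the on-farm strip trials described above:

```python
import numpy as np

# Hypothetical look-up table relating mid-season NDVI of a test strip
# to a recommended N rate (kg N/ha). Values invented for illustration;
# the real table would be built from on-farm strip trials.
ndvi_points = np.array([0.30, 0.45, 0.60, 0.72, 0.80])
n_rate_points = np.array([90.0, 70.0, 45.0, 20.0, 0.0])

def recommended_n(ndvi):
    """Interpolate the look-up table. At or above the plateau NDVI
    (0.80 in this sketch) no extra nitrogen is recommended."""
    return float(np.interp(ndvi, ndvi_points, n_rate_points))

print(round(recommended_n(0.52), 1))  # → 58.3
print(recommended_n(0.85))            # → 0.0
```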


The third approach introduced Machine Learning (ML) to Approach 1. First, a whole set of seasonal variables was introduced into the yield prediction calculation as possible explanations for the natural year-to-year variation in crop yield. The table below shows the type of variables taken into account.

Additional seasonal variables considered in Machine Learning (courtesy Field Crops Research Journal)

A simple regression analysis was made to identify the most influential seasonal variables. Next a Machine Learning method known as Random Forest (RF) was used to investigate various decision-tree models (combinations of the most influential seasonal variables) that could possibly lead to an applied nitrogen recommendation at mid-season. There are some useful video links at the bottom of this Insight article that explain decision-trees and RF. It turned out that the most influential variables for Approach 3 were: NDVI, RI (response index), soil moisture and rainfall. The Random Forest trees were derived using half of the historic OSU winter wheat data and refined so as to create a Machine Learning model that could be used to predict the recommended nitrogen application for the remaining 50% of the historic OSU data.

Selection of seasonal variables based on minimising the RootMeanSquareError (courtesy Field Crops Research Journal)
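The Random Forest step can be sketched with scikit-learn. Note that the data below is entirely synthetic, standing in for the OSU learning set, and the four columns simply mirror the most influential variables named above (NDVI, response index, soil moisture, rainfall):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the OSU learning set: 200 field-seasons
# described by the four most influential variables of Approach 3.
n = 200
X = np.column_stack([
    rng.uniform(0.2, 0.9, n),   # NDVI
    rng.uniform(1.0, 2.5, n),   # response index (RI)
    rng.uniform(5, 40, n),      # soil moisture (%)
    rng.uniform(50, 400, n),    # in-season rainfall (mm)
])
# Made-up relationship standing in for the true optimal N rate (kg/ha)
y = 120 - 100 * X[:, 0] + 10 * (X[:, 1] - 1) + rng.normal(0, 5, n)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[:100], y[:100])           # half as learning set
predictions = model.predict(X[100:])  # half as test set
print(predictions.shape)  # → (100,)
```

The real study did exactly this split: half the 20-year OSU record to grow the trees, half to test the recommendations.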


The fourth and final approach was to apply Machine Learning to Approach 2 and produce a model Colaco called Data Driven. For their Data Driven approach, all the available sensor data was added to Approach 2 so that a very wide range of information was used to find the most influential seasonal variables. This time all 12 of the variables in the above table were used for Approach 4. Again the Machine Learning Random Forest method was used to find the set of decision-trees that best represented the 50% learning set. This set of decision-trees was then used to predict the recommended nitrogen application for the remaining 50% of the historic OSU winter wheat data.

So after all this modelling and number crunching what was the result?

The performance of the four Approaches was evaluated by plotting the recommended nitrogen rate against the actual optimal nitrogen rate. An R2 value of 1.0 would indicate a perfect fit, and the root mean square error (RMSE) values were used as an indication of the accuracy of each Approach. Based on these criteria, Approach 4 is the clear winner.

  • Approach 1 R2 = 0.42 and RMSE = 31.3 kg N ha-1
  • Approach 2 R2 = 0.63 and RMSE = 21.9 kg N ha-1
  • Approach 3 R2 = 0.51 and RMSE = 26.0 kg N ha-1
  • Approach 4 R2 = 0.79 and RMSE = 16.5 kg N ha-1
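For anyone wanting to reproduce this kind of evaluation, the two metrics are straightforward to compute. The optimal and recommended rates below are invented for illustration, not the OSU results:

```python
import numpy as np

def r_squared(actual, predicted):
    """Coefficient of determination: 1.0 means a perfect fit."""
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - np.mean(actual)) ** 2)
    return 1 - ss_res / ss_tot

def rmse(actual, predicted):
    """Root mean square error, in the units of the data (kg N/ha)."""
    return np.sqrt(np.mean((actual - predicted) ** 2))

# Illustrative optimal vs recommended N rates (kg N/ha), not OSU data
optimal = np.array([0, 20, 45, 60, 80, 110], dtype=float)
recommended = np.array([10, 25, 40, 70, 75, 100], dtype=float)

print(round(r_squared(optimal, recommended), 2))  # → 0.95
print(round(rmse(optimal, recommended), 1))       # → 7.9
```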

Oklahoma State University winter wheat yields varied from 1 tonne per hectare to 7 tonnes per hectare, and over the 20 years and across all the fields in the database, the optimal mid-season nitrogen application based on actual yields varied between zero and 110 kg N per hectare. Reducing the error in nitrogen application from 31.3 to 16.5 kg N per hectare by using Approach 4 rather than Approach 1 is therefore a significant optimisation of nitrogen supplementation. This would be expected to result in lower costs (when Nrecommended is too high) and higher yields (when Nrecommended is too low).

Applying machine learning can improve the use of either direct or indirect parameters in precision agriculture. Using multiple variables that farmers encounter from year to year and from field to field can produce more robust advice, even when the variables are used directly, without knowing exactly how they affect the yield of a crop.

Smarts or knowledge? Precision agriculture gains from the use of both better understanding and knowledge. When machine learning is added to a method, on farm experiments and local variables like microclimate can produce the very best results.

Corbeau for one, can’t wait to see the results of applying this approach in UK vineyards as well as winter wheat in the USA.

  • Find the whole article from Colaco in Field Crops Research
  • Read a short related article on a similar study in Australia
  • Watch a related video featuring one of the CSIRO team
  • Watch a FUN explanation of Random Forest machine learning, yes really!
  • Subscribe to receive our own monthly precision viticulture pilot study newsletter
Vineyard yield estimation with smartphone imaging and AI
vine shoot and cluster
Vine shoot and flower buds

There is so much potential in those tightly closed flower buds. Over the course of the summer the flowers on vines bloom, turn into tiny green spheres and ultimately into heavy bunches of grapes. Or at least that is the hope of the vineyard owner and winery. Accurately estimating the size of the harvest well in advance has a number of advantages. Early yield estimation allows the right number of pickers to be hired at a reasonable rate, and the right amount of tank space, bottling and packaging capacity to be reserved.

Yield estimation by manual visual inspection is the method recommended by the Grape and Wine Research and Development Corporation (GWRDC, Australian Government). In a 2010 guidance sheet, Professor Gregory Dunn (University of Melbourne) recommends randomly counting grape clusters across entire vineyard parcels. There is good correlation between the number of grape clusters per vine and the ultimate yield.

Correlation between vineyard production and grape cluster count (courtesy of GWRDC)
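In its simplest form, the cluster-count method boils down to arithmetic like this. All the numbers below are invented for illustration; in practice the average bunch weight would be taken from previous harvests:

```python
# Back-of-envelope yield estimate from a random cluster count,
# in the spirit of the GWRDC guidance. All numbers illustrative.

counts = [14, 9, 12, 16, 11, 13, 10, 15]  # clusters on 8 random vines
vines_per_hectare = 3500                  # assumed planting density
avg_bunch_weight_kg = 0.12                # assumed from past harvests

mean_clusters = sum(counts) / len(counts)
yield_t_ha = mean_clusters * vines_per_hectare * avg_bunch_weight_kg / 1000

print(round(yield_t_ha, 2))  # → 5.25 (tonnes per hectare)
```

The weakness, as noted above, is the sampling: if the random vines are not representative, or the canopy hides bunches, the estimate drifts.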

It is not always easy to make an accurate random sample count of bunches and previous yields can vary from year to year. Counting in vineyards in cool climates like the UK has both these difficulties because seasons tend to be more variable than further south and more vigorous vines tend to be planted. Vigorous vines like Reichensteiner produce thick leaf canopies that obscure developing fruit.

Researchers at Cornell University recently reported a novel, cheap and effective method of early yield estimation based on smart phone video footage of a whole vineyard and artificial intelligence (AI) analysis of the recorded images.

smartphone_ATV grape vine imaging
All terrain vehicle with smartphone on gimbal and LED lighting panels (courtesy of Frontiers in Agronomy and Cornell University)

Stereo-imaging and LIDAR measuring devices have been around for a while now but they are expensive: think thousands to tens of thousands of pounds to equip a vineyard with a system. The Cornell system is essentially a smartphone on a gimbal with a lighting boom that can be driven or walked up and down the rows of a vineyard at night.

In addition to being a low cost solution, it is also effective. They report a cluster count error rate of only 4.9%, almost half that of traditional manual cluster counting. Improved cluster counting, and therefore better yield estimation, is obtained mainly due to better random sampling of the vineyard and better identification of clusters. Over two growing seasons the Cornell team found that early video imaging gave the best results because small clusters and shoots were not yet obscured by large leaves and a dense canopy.

So how did they turn a rather long video into an accurate and precise cluster count?

Firstly the different objects (leaves, shoots, clusters, posts etc) in the video needed to be classified. Building the classifier is the major task in a machine learning implementation, and there are a number of Open Source tools readily available to do it. They chose a Convolutional Neural Network (CNN) to identify objects in the video images. CNNs apply digital filters to simplify and exaggerate whole images, making them look more round, jaggy, linear etc. These filters are applied to the image under test to find the combination that best fits a set of defined examples of a particular object. But how does the CNN know what an object is? Who defines the objects? The Cornell answer is student interns. They were given sets of training images and a copy of the Open Source Python app LabelImg, and tasked with drawing boxes around each object of interest and giving them the label ‘cluster’. The other useful source of information to train the CNN was the so-called Microsoft COCO (Common Objects in COntext) dataset. COCO is essentially a large set of labelled images of everyday objects that are not grape clusters. The image below shows how clusters are identified from video footage.

vine clusters located by CNN
Vine clusters identified by trained CNN (courtesy of Frontiers in Agronomy and Cornell University)
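LabelImg saves each annotated image as a Pascal VOC style XML file. Here is a minimal sketch of reading one back with Python's standard library; the file content below is hand-written for illustration, following the field names of the VOC convention:

```python
import xml.etree.ElementTree as ET

# A hand-written example of the Pascal VOC XML that LabelImg produces
# for one image with one labelled bounding box.
voc_xml = """
<annotation>
  <filename>vine_001.jpg</filename>
  <object>
    <name>cluster</name>
    <bndbox><xmin>120</xmin><ymin>80</ymin><xmax>260</xmax><ymax>310</ymax></bndbox>
  </object>
</annotation>
"""

def read_boxes(xml_text):
    """Return (label, (xmin, ymin, xmax, ymax)) for each labelled box."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        b = obj.find("bndbox")
        coords = tuple(int(b.findtext(tag))
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((label, coords))
    return boxes

print(read_boxes(voc_xml))  # → [('cluster', (120, 80, 260, 310))]
```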

TensorFlow, a user-friendly Open Source platform, was used to train the neural network and apply it to the video footage.

It would be fascinating to apply this cost-effective and early yield technology in the UK, where the climate is warming but seasons are still variable.

Read the whole Frontiers in Agronomy paper here.

Watch a brilliant explanation of AI from Microsoft’s Laurence Moroney here.

Find out more about TensorFlow here.

Download LabelImg here.

Find out more about the COCO Project here.

table grapes
Precision viticulture for table grapes
Table grapes (courtesy

Some farmers do not pick grapes, squash them and then ferment the juice. Some farmers just sell them for eating. And why not? Grapes have long been the gift of choice for hospital visitors. The glucose sugar in grapes is absorbed quickly into the blood stream to provide a burst of energy for a sick patient. Table grapes are also a major global fruit crop, with almost 26 million metric tonnes grown last year, 2020. Typically table grapes account for roughly one third of global grape production; less than 10% of grapes are grown for dried fruit and the bulk are grown for wine making.

It is logical then that table grape growers are becoming more and more interested in the new precision viticulture technologies being adopted in winery vineyards. The key questions for farmers are: which technology to adopt? What are the benefits? How easy is it to use? These are the questions a group from the Agricultural University of Athens (AUA) have been keen to answer, and they published their findings in a paper last month.

Emmanouil Psomiadis and his colleagues from AUA made a comparison of two common approaches to vineyard monitoring: remote (via satellite) and proximal (ground based) multispectral analysis of a table grape crop canopy. Multispectral analysis measures the amount of light of different wavelengths reflected by a crop. Whereas the human eye can only see three different colour bands (red, green and blue), imaging detectors like the Multi Spectral Instrument (MSI) on the European Sentinel-2 satellite have up to 13 bands from the violet end of the visible spectrum to the invisible infrared. Having more spectral bands offers the possibility of better resolution of different plant pigments like chlorophyll, xanthophyll and lycopene. More accurate measurement of plant chemistry can potentially be used to pick up early signs of disease and nutrient deficiencies.

ESA Sentinel-2 satellite (courtesy European Space Agency)

The team from AUA were also interested to discover which of the many so-called Vegetation Indices (VIs) were the most useful. One of the features of satellite measurements of the earth surface is that the light on the ground is not always the same. It’s true that the sun’s output is pretty much constant but the angle of the sunlight falling on plants in a vineyard changes throughout the day and from season to season. The atmosphere tends to scatter blue and violet light more than red and infrared and therefore sunlight appears red in the evening and yellow in a clear sky at midday. Sunlight reflected back to an orbiting satellite from a vine canopy therefore changes from day to day.

Vegetation Indices, which are ratios of different spectral bands, were introduced to overcome these variations. Three of the most widely used are the Normalised Difference Vegetation Index (NDVI), Normalised Difference Red Edge (NDRE) and Fraction of Absorbed Photosynthetically Active Radiation (FAPAR). Researchers have their own favourite VI, but each tries to accurately determine the fraction of the ground covered by green vegetation within a particular image pixel.

Measurements with proximal multispectral instruments have a number of advantages over satellite imaging. Handheld or tractor mounted instruments cannot cover the vast areas caught in a single frame by a satellite detector but they are never interrupted by cloud and they can be fitted with their own light sources. With a consistent light source and measuring just the vines, rather than the earth or cover crop between them, proximal multispectral imaging could offer better quality vegetation indices than satellites. The AUA group chose to use satellite data from the ESA Sentinel-2 satellite and proximal multispectral imaging from a Holland Scientific Crop Circle ACS-470 (see below mounted on a quad-bike).

Crop Circle ACS-470 on boom attached to quad-bike for multispectral measurements of vine canopy
(courtesy Agricultural University of Athens and Agronomy Journal)

Slide showing spectral wavelength (microns) and width of Sentinel-2 detector bands and chlorophyll absorptions (arrows)

To test the ability of both proximal and remote multispectral imaging to quantify grapevine vigour (as a VI), Psomiadis and colleagues chose a small vineyard near Corinth, Greece growing table grapes. They monitored the vineyard through the season, recording proximal multispectral data from vine canopies and downloading satellite images from the start of ripening (veraison) to the point of harvest. The best correlation between the two sources of spectral measurements was obtained with NDVI and FAPAR (Fraction of Absorbed Photosynthetically Active Radiation) indices.

NDVI = (Reflectivity(Near infrared) – Reflectivity(Red)) / (Reflectivity(Near infrared) + Reflectivity(Red))

Equation for calculating NDVI; the higher the ratio, the greener the canopy
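The equation is easy to apply per pixel. A minimal NumPy sketch, using small invented reflectance arrays in place of real red and near-infrared bands:

```python
import numpy as np

# Per-pixel NDVI from red and near-infrared reflectance arrays.
# The 2x2 'images' below hold invented reflectance values (0 to 1);
# in practice these would be whole satellite or sensor bands.
red = np.array([[0.08, 0.10], [0.30, 0.25]])
nir = np.array([[0.50, 0.45], [0.35, 0.30]])

ndvi = (nir - red) / (nir + red)
print(ndvi.round(2))  # high values = green canopy, low = bare earth
```

In this sketch the top row (NDVI around 0.7) would read as healthy canopy, while the bottom row (below 0.1) would read as soil or senescent vegetation.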

Interestingly, it was found that the best correlation (>0.87) between proximal and remote measurements was at the later stages of grape ripening. Sentinel-2 satellite images have at best 10 m spatial resolution; therefore measurements earlier in the growing season, before the full development of the leaf canopy, will include a bigger contribution from bare earth or a cover crop grown between the rows of vines.

Many more details about the vineyard studied and the calculation of different VI ratios can be found in the Agronomy journal paper. Importantly, the study has shown that multispectral measurement of vineyard vegetation indices using both remote and proximal technologies give consistent results.

The paper suggests three answers to the key questions from grape farmers:

  • use either remote or proximal multispectral measurement, ideally both as they are complementary
  • the key benefit demonstrated from the study is that plant vigour can be non-invasively quantified either with low resolution satellite imaging or higher resolution proximal measurement
  • Sentinel-2 imagery is freely available but some knowledge is required to create the VI maps of a vineyard. Proximal measurement is quicker than traditional field walking but with GPS tracking it can be straightforward to relate VI ratios to vines

Read the full paper here.

Find out more about the ESA Sentinel-2 mission here.

Find out more about Corbeau’s own multispectral vineyard project.