Characteristics of CD44 alternative splice pattern in the course of human colorectal adenocarcinoma progression

Background CD44 is considered as ‘a’ metastasis associated gene, despite the fact that it is an umbrella term for a group of molecules produced from a single gene by alternative splicing. However, little consideration is given to the above in the literature of colorectal carcinomas as well as other tumour types, leading to confusion and contradictory results about its possible role in tumour progression. Methods We compared the CD44 alternative splice pattern (ASP) of three genetically different human colorectal cancer cell lines (HT25, HT29, HCT116) using a series of PCR reactions and next- generation sequencing method, as well as identified a colorectal adenocarcinoma specific CD44 ASP. This ASP was further investigated in terms of its qualitative and quantitative stability in our experimental iso- and xenograft mouse models for colorectal cancer progression. A complex preclinical experimental set-up was established to separately test the different steps of tumour progression and the role of tumour microenvironment, respectively, focusing on the role of ‘CD44’ in this process. Results We managed to present a colorectal cancer-specific CD44 ASP, which remained unchanged from cell lines throughout primary tumour formation and metastatic progression. Furthermore, we report a unique roster of all expressed CD44 variant isoforms characteristic to colorectal cancer. Finally, on quantitative assessment of the variable exons v3 and v6, higher co-expression levels were found to be characteristic to metastatically potent tumour cells. Conclusion Particular CD44 variant isoforms seem to act as “metastasis genes” via tumour microenvironment-driven shifts in v3 and v6 expressions. However, this function may just affect a minority of tumour subclones. This fact and the huge potential number of different CD44 splice variants that can contain v3 and v6 domains can explain incoherence of clinical studies regarding functional asessment of CD44 variants, as well as diminish the chances of using CD44 variants for predictive purpose.


Background
In developed countries, colorectal cancer (CRC), or rather its progression to metastatic disease, accounts for 25% of tumour deaths [1]. Characterising the metastases associated processes is therefore of crucial importance for identifying ways of earlier and more sensitive diagnosis, more defined prognosis, and possibly in the selection of patients for targeted therapies.
Each individual step of the metastatic cascade is the result of complex molecular interactions, regulated by several already identified and/or unidentified genes [6]. One of the candidates of key importance has been CD44. In fact, it is one of the most investigated molecules in the metastatic process of several malignancies [7][8][9][10][11][12][13], among them that of colorectal cancer [14][15][16].
CD44 was first described to be a lymphocyte homing receptor [17]. The role of CD44 was later proven in fetal arteriogenesis [18], as well.
A number of studies demonstrated the expression of CD44s (standard version) on most tissue types of epithelial origin (stomach mucous membrane, small intestinal mucosa, prostate, ductal epithelium of breast, skin, hair follicules, transitional cell membrane of utgenital tract) [22][23][24][25][26]. Regarding colonic mucosa, immunohistochemical examinations demonstrated that CD44s (identified with antibody against the standard region of the molecule) is located at the level of basal crypts, similar to the "intestinal stem cells" of the small intestine sitting between Paneth-cells [27][28][29]. This means that proliferative cells of colonic mucosa, as well as cells in the basal segment of differentiation zone are able to provide CD44s expression, while differentiated mucosal cells don't show membrane positivity on immunostaining any more [30]. This fact gains special importance when considering the change of 'CD44' expression characteristics from the normal, the dysplastic and the malignant colorectal tissues [31].
With the presence of CD44 protein in so many tissues, consideration has to be given to its physiological and pathological functions.
'CD44' has a variety of biological functions such as cell-cell, cell-ECM interactions, tumour cell migration [40,41] or even chemoresistance [16,[41][42][43][44][45][46][47][48][49][50]. This multitude of representations and functions is less surprising when considering that through its 9 variable exons (v2-v10), hundreds of different isoforms can be transcribed and translated from the single CD44 gene resulting in a mixture of glycoproteins with potentially different functions [51][52][53]. Talking about 'CD44' and studying its 'over expression' therefore is rather debatable and not as straight-forward as it was previously thought. A basic understanding and consideration of alternative splicing should be warranted when dealing with this versatile gene. For the same reason, when using probes (i.e. primers or antibodies) against the standard region of CD44 (CD44s), one will detect a mixture of isoforms as they all share the same standard region [54,55].
Determining the role of 'CD44' is further complicated by the difficulties to identify all the isoforms potentially characterizing certain cell or tissue types. The trend is to focus on the expression of a single variable exon, which, of course, will only show the summation of isoforms containing the exon in question. However, these isoforms might only share a single variable exon, therefore they can be of different functions. In addition, several other CD44 variant isoforms can be present within the same cell, adding further functions to 'CD44' [56][57][58][59]. Furthermore, different cells of a tumour can express various, possibly different sets of CD44 isoforms. As most of our tests examine a group of cells from the tissue at the same time, the results are summated, failing to represent the cell-to-cell differences in details [6]. In colorectal carcinoma the most studied variant exons / protein domains are v3 and v6.
CD44v6 functions as receptor of HGF/SF cooperating with c-met [61,62], regulating EMT and taking part in the positive feedback loop of K-ras activation driving both Ras signal transduction pathway and CD44-splicing machinery [3,63,64]. Additionally, this is the domain most frequently found to be associated with metastatic phenotype in the literature [65].
Unsurprisingly, the studies of detecting 'v3' and 'v6' under different experimental and clinical settings do not lead to coherent results, and can even be contradictory. This also can be observed in the literature regarding colorectal cancer [14,[66][67][68][69][70][71].
Recently, characteristic CD44 isoform patterns of different normal tissue types have been described in the literature [24][25][26]72,73] and our earlier studies described CD44 isoform patterns characteristic of different tumour types, respectively (data publication in progress, Rásó-Barnett et al.). Most of the literature agrees that 'CD44' plays an important role during tumour progression based on 'overexpression' of specific variant regions containing isoforms at mRNA and/or protein level. However, no data is available concerning the pattern of CD44 isoforms and its modulation during tumour progression. Our previous study on CD44 alternative splice patterns showed no qualitative change of the melanoma specific 'CD44-fingerprint' , while the expression level of all isoforms uniformly increased during the metastatic process (data publication in progress, Rásó-Barnett et al.).
The aim of our research was to find answers to the following two questions: (1) Is there a colorectal carcinoma specific CD44 isoform expression combination like to one identified in human melanoma? If there is such pattern, is it qualitatively / quantitatively stable during tumour progression? (2) What is the background of the incoherent quantitative results on the expression of the most investigated v3 and v6 variable exons / protein domains?
To answer the first question, we designed primer pairs covering the entire variable region (primers A-B, see Methods) for our quantitative and qualitative PCR reactions. The expressed, i.e. transcribed CD44 isoforms were identified using direct sequencing and nextgeneration sequencing.
Due to the large number of potential isoforms, we used a series a PCR reactions to create the CD44 fingerprint, a simple, easy to handle representation of the expressed CD44 isoforms. We used this fingerprint to track the changes of CD44 expression during tumour progression [41,55,[74][75][76].
To answer our second question, we designed a complex experimental animal model to examine the functional role CD44v3 and CD44v6 containing isoforms play in metastasis formation. The main question was whether or not there is a change in the expression of these variable exons that could be associated to the metastatic process of human colorectal carcinoma. We implanted the same colorectal cancer cell population orthotopically to form real metastases in liver, as well as heterotopically. We performed two types of heterotopic implantation: the first involved subcutaneous implantation into permissive and non-permissive hosts, the second was direct implantation into the spleen so that the implanted cells could directly colonize the liver. The latter method created a new type of secondary tumour, which albeit is in the same localization as the liver metastases of the true metastatis model, has not been through the metastatic cascade. This experimental setup allowed to functionally differenciate between phases of metastatic process and host related changes, regarding the expression activity of the examined variable CD44 exons.

Results
Colorectal adenocarcinoma (CRC) specific CD44 alternative splice pattern (ASP) Human colorectal cell lines (HT25, HT29, HCT116) could be characterized by the dominance of five bands on agarose gel electrophoretograms (ELFO) when examining the variant region of CD44 with a primer pair testing and overlapping the entire variant region (Primer A -Primer B) Figure 1, Figure 2B. This PCR reaction would theoreticaly amplify each transcribed CD44 variant isoform. In three of the five bands (372 bp, 564 bp, 768 bp) we identified a particular CD44 isoform by direct sequencing. Even these bands, as well as those bands which were not definitely identified by direct sequencing, potentially 'hide' more, different isoforms. None of the dominant PCR producs that were identified by direct sequencing represented v3 or v6 containing isoforms. Next-generation sequencing (NGS) performing allele specific high performance sequencing managed to identify further isoforms including some of those which contain v3/v6 exon products. This method managed to differentiate more isoforms hidden behind the same ELFO bands. Furthermore, eight more isoforms with very low expression activities were identified by this sensitive method. These isoforms do not appear on the qualitative picture of ELFO. (Surprisingly, NGS provided evidence to the fact that CD44 isoform without any variant exons (CD44v0) does exist, in spite of the widely accepted theory that v1 variant exon is constitutively expressed) Figure 1.
Additionally, we identified a number of definitely expressed CD44 isoforms by four further PCR sets performed with primer pairs directly designed onto exons v3 and v6 (Primer pairs A-D, A-E, C-B, C-E) Figure 1, Figure 2A. Primers designed for particular exons (e.g. v3 or v6) gave evidence of that exon being transcribed, as a minimal premise of amplification on PCR. Some of these isoforms did not appear among the products of the reaction ' A-B' , either for some quantitative or particular PCR reaction technical reasons. Precise knowledge of exon lengths enabled us to calculate the expression of further isoforms according to product sizes in addition to the ones detected by direct sequencing of PCR products. These two methods provided a rich roster of definite and estimated CD44 isoforms which are represented by being isoforms of very similar size.
This way we identified CD44 isoforms expressed in CRC cell lines from a qualitative picture derived from five-primer-pair PCR series covering the entire length of the variable region of CD44 with overlapping sequences. By running the PCR products of the five different primer pair reactions on agarose gel always in the same order, a colorectal cancer specific alternative splice pattern (ASP) could be described for all three human colorectal cell lines (HT29, HCT116 and HT25) Figure 2B.
After all, there were 26 different CD44 isoforms identified in the qualitative CD44 fingerprint of colorectal cancer, 24 of which were proven by sequencing and only two remained 'estimated/calculated' isoforms Figure 1. However further isoforms can also be present, only hidden behind 'truncated' CD44 isoforms sharing the same truncated part of the whole molecule (i.e. Isoform 14-26 on Figure 1.).
Characterisation and examination of the stability of the colorectal CD44 'fingerprint' Comparison of tumours growing in vitro and in vivo CD44 pattern of our samples from the xenotransplantation model has proven that the pattern is preserved in vivo the same way it was seen in vitro. ASP of CD44 was found to be highly similar of HT29 cells cultured in vitro as compared to orthotopic primary (colon) or heterotopic primary (spleen, subcutaneously in adult and newborn mice) Figure 2D,E. HT25 and HCT116 cell lines were found to behave the same conservative way, (data not shown).  Figure 2A. Expressed CD44 isoforms in colorectal cancer were detected. PCR with A-B primer pair can theoretically provide all expresed isoforms, but for the reason of sensitivity limitations of PCR technique only five main bands were detected. PCRs with the other four primer pairs could reveal expressed (quantitatively inferior) isoforms containing v3 and v6 exon products.Some of the bands could be identified as distinct isoform products (proven by direct sequencing), others covered more, similar length isoforms identified by allele-specific next-generation sequencing or estimated by length calculation of the products. Estimated/calculated isoforms are shown in Italic, red letters. Twenty-six CD44 isoforms could be differentiated this way, some of them are just truncated, so that may in fact cover several isoforms sharing the same common variant exons.

Behaviour of CRC-specific CD44 alternative splice pattern (ASP) during the metastatic process of the tumours
To determine whether the expression pattern changes during the metastatic process in our xenograft system, we implanted HT29 cell suspension into spleen and orthotopically into large bowel, coecal wall. We then compared the CD44 pattern of the primary tumour and the liver metastasis/colony of the same animal in both models. The pattern was found to be unchanged Figure 2D. Similar, unchanged expression patterns were detected in the other two tumour types, HT25 and HCT116, respectively (data not shown).
In the case of HT29, the heterotopically (subcutaneous) implanted primary tumour showed the exact same pattern as the cell suspension in both the metastatic (newborn) and non-metastatic (adult) host. In metastatic cases (in each newborn animal) pulmonary metastases gave the same ASP, as well Figure 2E. The number of implanted animals in each arm of the model was three, the qualitative expression pttern was overally homogenous. (A total number of 40 animals were transplanted, 28 of them developed metastases, and we used 3-3 animals to check the CD44 fingerprint.) Quantitative change of certain exons (v3/v6) of the CD44 variable region during tumour progression of colorectal cancer in mice Mouse C26 isograft system CD44v3 and CD44v6 expressions of orthotopic (coonic wall) and heterotopic (splenic implantation) primary C26 tumours (four mice in each group), as well as spontaneous liver metastases and liver colonies (from splenic implantation) were compared using real-time PCR Figure 3A. Data indicate that while the expression level of both variable exons were more than one magnitude larger in the liver metastases compared to that of the primary tumour in the orthotopic implantation model (significant differences both on v3 and v6 expression), no significant alteration was detected between the spleen primary and the liver colony. Furthermore, it is remarkable that primary tumours of both systems showed similar expression levels.

Human colon cancer xenografts
All three implanted human colon cancer lines (HT29, HT25, HCT116) possessed similar metastatic potential in both systems, showing a 50-65 % metastatic rate.
Real time PCR measurement using CD44v3 and CD44v6 exon specific primers showed three different ways of behaviour in the three colorectal cell types. In representative samples we compared the relative CD44v3 and CD44v6 levels of the primary tumours and liver metastases or liver colonies in both systems, similarly to the isograft experiments Figure 3B.
In case of HT29, identically to C26, we detected significantly elevated expression levels in liver metastases after orthotopic implantation compared to the primary tumour of the colonic wall, while both variable exons were expressed at the same level in both the primary splenic tumour and the liver colony of the spleen-liver system. It is also interesting that primary tumours from both localisations showed similar expression levels.
HT25 behaves opposite to Balb/C-C26 isograft system. Primary tumours and metastases both showed identical expression levels in the orthotopic system.In the case of HCT116, the primary tumours provided extremely high CD44v3/v6 expression levels. Liver metastases showed a magnitude lower expression level. Therefore, the third cell line represents a completely different, third pattern of behavior Figure 4A.

Spontaneous lung metastasis xenograft model from subcutaneous implantation
The same three, genetically different human colorectal cell suspensions (HT25, HT29, HCT116) were implanted into lumbar subcutaneous localization of three-three adult and newborn scid mice. No metastases were formed in the adult animals, while each newborn developed distant lung metastases. CD44v3 and CD44v6 expression levels of non-metastatic adult and metastatic newborn primary tumours were measured. The results from the three examined human xenografts are summarised on Figure 3C.
The difference in v3 and v6 variant expression level between the metastatic newborn and non-metastatic adult primary tumours is clearly demonstrated. The tendency is uniform across all three CRC tumours in this model, showing 2-3 fold higher v3 and v6 expression levels in the newborn than in adult primary tumours, however the difference with three animals in each arm could just reach the level of significance in the case of HCT116 colorectal tumour type.

Discussion
Despite hundreds of published studies on CD44 within the last two decades, noconsensus opinion has been reached as of today apart from that it plays some role in tumour progression [69,76]. One of the possible explanations is that from microarrays to immunohistochemistry, the overwhelming majority of studies fail to take into consideration that 'CD44' is a collection of isoforms generated via alternative splicing of its nine variable exons. Probes, primers and antibodies against the standard region will recognise a pool of different isoforms as they all share the same standard region, and not the isoform itself, which is functionally active and involved in the metastatic process. A good example of this is colorectal adenocarcinoma, where the cells of origin, i.e. the glandular epithelium of the large bowel, express standard form of CD44 (CD44s), but not variant ones (CD44v), while highly dysplastic colorectal adenomas, primary and metastatic colorectal cancers express a large variety of different-length CD44 isoforms, both on mRNA and protein levels [14,31,44,[77][78][79]. Some of these new CD44 isoforms acquired during malignant transformation contain v3 and v6 domain which add a Figure 3 CD44v3 and CD44v6 expression activity in mouse isograft CRC metastatic and liver colonisation system. A. Semi-quantitative CD44v3 and v6 gene expression activity of primary and secondary C26 colon carcinoma isograft tumours implanted orthotopically (colonic) and intrasplenic (liver colonisation model) into Balb/C mice.While in "proper" metastatic system (orthotopic implantation) liver metastases expressed CD44v3/v6 at significantly higher level than colon primaries, in colonisation model liver colonies showed similar expression profile to spleen primaries. B. CD44 v3 and v6 gene expression activity of primary and secondary HT29 human colon carcinoma xenografts in orthotopic and intrasplenic implantation models to scid mice. Selection of CD44v3/v6 high subclone at primary site can be suspected, as liver metastases in orthotopic implantation model system showed higher expression activities than colon primaries. No "clone selection" effect was detected in liver colonisation model (after intrasplenic implantation). C. CD44v3 and v6 gene expression activity of non-metastatic (implanted into adult) and metastatic (implanted into newborn) primary human colon carcinoma xenograft tumours implanted subcutaneously (heterotopicaly) into scid mice. Expression rates are normalized on beta-actin expression rates and non-metastatic v3/v6 expression rates. All three examined human colorectal carcinoma cell types showed higher CD44v3 and CD44v6 expression activities in metastatic system (subcutaneously implanted colon tumours in newborn mice, which developed distant metastases in each animal cases) than in non-metastatic (the same colon carcinomas implanted into adult mice subcuticularly without any single case of distant metastasis formation).
completely different range of functions to the preexisting CD44s functions, as CD44v6 is a co-receptor of the met protein, and CD44v3 is one of the most important presenters of heparin-binding growth factors, such as HGF, the main ligand of Met [60,62]. One must also consider that multiple variable regions might be present in a single isoform, adding further functional abilities. It is likely that the molecular interactions of CD44v3 are different from those of CD44v2-v10 variant despite both containing the v3 region, yet they will be 'measured' as a single product when using CD44v3 primers/probes/ antibodies. Again, probes against any particular exons /protein domains od CD44 will demonstrate a wide range of CD44 isoforms, all of which share the same particular exon/domain. Furthermore, most of our tests are performed on mixtures of different tumour tissue elements: these assays work on a conglomerate of tumour cells, stromal cells, their proteins or gene products. Results of these reactions will lways represent some sort of average or summation. This might influence the qualitative expression pisture of distinct gene products of a tumour sample, but certainl will affect quantitative one. Considering this makes it easier to understand the contradictions of the literature.
Our aim was to identify a roster of CD44 isoforms expressed by colorectal cancer to see the extent of complexity we need to face when examining 'CD44'. We also wanted to determine whether a potential "malignant" colorectal adenocarcinoma-specific isoform parttern exists. We then aimed to examine the stability of this Figure 4 CD44v3 and CD44v6 expression activity in human xenograft CRC metastatic and liver colonisation system. A. CD44v3 and v6 gene expression activity of primary and metastatic human colon carcinoma xenograft tumours (HT29, HT25 and HCT116) implanted orthotopically (colonic wall)into scid mice. The three genetically different human colorectal carcinoma lines showed three different behaviours in terms of CD44v3 and v6 expression rate changes during metastatic process. The reason of it might be the baseline level of expression in prmary tumours: relative expression rates (CD44v3 and v6 expression normalized on β-actin expression rates) of the three tumour types showed 1-1 order of magnitude differences. As we measure summated expression rates (average) of each tumour cells of the particular tumour mass, differences in CD44 variant exon expression levels might be explined with differences of CD44v3/v6 high cell proportion of the tumour cell mass. B. Model of metastatic clone selection in each three cell lines, schematic explanation of the differences in a consequent model. pattern during tumour progression, i.e. whether it is qualitatively stable or new isoforms become dominant. This was tested in our complex iso-and xenograft experimental animal models. Reproducibility of v3 and v6 changes described in the literature was also tested in our experimental animal models. These models enabled us to separately study steps of tumour progression as well as the role of the tumour microenvironment.
First of all we attempted to identify the expressed isoforms from the qualitative CD44 ASP. We concluded that although not all the possible isoforms are expressed in reality, it is still rather technically impossible to identify the ones that do get expressed.All three attempts of ours on completing the isoform roster were only partially successful: (i) Our A -B primer overlapped the whole CD44 variable region, still just a few "dominant" isoforms could be identified as qualitative PCR products with direct sequencing. Some other isoforms could me identified according to the PCR product length, as well. Further CD44 variants were added by four other PCRs carried out with primer pairs designed onto particular exons of interest (v3 and v6). (ii) Next-generation sequencing theoretically should provide an allele-specific list of all transcribed isoforms. Still we reached limitations of the technique, but managed to identify more, sofar unidentified isoforms. (iii) I our previous publication on similar experimental set-up on melanoma (data under publication, Rásó-Barnett et al., paper accepted) identification attempt of "each" expressed CD44 isoform by cloning was reported to be fruitful, but again far from complete. We feel that the result of CD44 splicing is far richer than the sensitivity and effectivity of today's tests. However, we do not yt know if this complexity of CD44 ASP carries any special functional importance. For now, we need to focus on functional characterization of distinct exons/domains of the complex molecule selectively.
Although the complete CD44 isoform list was not achieved, we identified simplified colorectal adenocarcinoma specific CD44 alternative splice pattern (ASP) which was qualitatively stable across samples from different colorectal cancer cell lines in vitro, and subcutaneous, intrasplenic and orthotopic implanted primary colorectal tumours and their liver metastases/colonies.
Although we know that not all of the expressed isoforms can be detected this way, the overlapping primer pairs we used covered the whole variable region, and provided a simplified pattern (ASP) which could be easily used to follow the possible changes of CD44 spilicing even in more complex in vivo systems. These results proved that, unlike the appearance of new CD44 isoforms during tumorigenesis, no new CD44 isoforms appear during tumour progression from primary tumour to metastasis. Therefore, it was likely that the expression changes of specific variant containing isoforms (as published in the literature) should be rather caused by quantitative changes of a set of isoforms.
Hence, we focused on quantitative expression of two particular exons: v3 and v6. These two exons also appeared to be dominant on our reported qualitative ASP pictures. We separately examined their role in different steps of metastatic cascade in our complex animal experimental model system. Furthermore we managed to gain a glimpse of the role of tumour microenvironment (host) in the modulation of CD44 splice variant expression in conjunction with metastatic potential, as well.
Our isograft model (Balb/C -C26) proved that quantitative changes only happen during real metastasis formation, considering that v3 and v6 expression levels of both orthotopic and heterotopic implanted primary tumours were similar, but increased expression levels were only detected in real liver metastases and not in the colonies. This means that the higher expression level of v3 and v6 is not the effect of the liver microenvironment, but it is the actual pre-existing feature of distinct metastatic cells within the primary tumour. While at orthotopic implantation into large bowel wall tumour cells must undergo each classical steps of metastasis formation (EMT, intravasation, survival in circulation, extravasation, MET, proliferation in metastatic target organ), intrasplenic implantation gives the chance to a proportion of tumour cells to get straingt into the circulation by skipping the early phases of the cascade. Comparison of the two models can potentially differenciate between genes, gene products that act in different in different phases of metastatic process.
The same theory was tested in a human xenograft model as well. HT29 human colorectal adenocarcinoma xenograft showed similar results to the C26 isograft, while both HT25 and HCT116 behaved differently, which proposed further explanation to the contradictions of CD44 literature. We hypothesise that a higher v3/v6 expression level is a necessary, but probably not satisfactory prerequisite of cellular metastatic capability. This is proven by the fact that real liver metastases of all three CRC cell lines expressed v3 and v6 in the same range, despite the different expression levels of their colonic primaries Figure 4A,B. This also means, that the number of cells expressing high levels of CD44v3 and v6 is not necessarily high within the primary tumour, as theoretically a single 'highly expressing' clone is enough for metastasis formation. In a similar experimental model system circulating tumour cells of malignant melanoma were found to express CD44v3 and v6 containing varints on a high level in real metastatic system (Rásó-Barnett et al. under publication) Correlating this with our result would mean that HT29 had the lowest, while HT25 and HCT116 had the higher percentage of 'highly expressing' , metastatically potent cells. This also carries the possibility of a primary tumour showing relatively higher expression levels than its metastasis, which would have already been re-selected by the new microenvironment. Translating this into clinical practice would mean, that the 'lack' of overexpression would not exclude metastatic potential or even a worse prognosis. As primary tumour site seemed to be interesting in terms of selection of metastatically potent, CD44v3/v6 high subclones, we tried to find support os this theory in another, quite different model system.
We further examined the role of the primary tumour microenvironment in our subcutaneously implanted xenograft model. CD44v3 and v6 expression levels were different in tumours implanted from the same cell suspension, into genetically identical host, i.e. newborn and adult scid mice, which were only physiologically different. Hence, differences in expression levels of the examined gene products (v3 and v6) could be correlated with differences in metastatic potential (0% vs. 100%). Microenvironmental (host) factors (such as maturity immune system, vascular permeability, cytokines and chemokines, etc.) should, therefore, stand as driving factors in the background of metastatically potent subclones which will ultimately determine the clinical behaviour of the entire tumour. These experiments prove that physiological factors of host (as well as primary tumour microenvironment) do matter and in extreme, experimental conditions can fully determine metastatic potential of a malignant system.

Conclusions
It is likely to assume that tumour microenvironment (tumour host organism) have a central role in metastatic phenotype presentation of primary colorectal cancer.
While normal tissues (such as colonic mucosa) do not express variant isoforms of CD44, tumours seem to perform a wide variation of CD44 isoforms. Presuming that different domains of different CD44 isoforms can perform a variety of new functions, there is no more sense in investigating CD44 "in general".
Moreover, expression pattern of CD44 isoforms was found to be stable, meaning that colorectal type expression profile remained unchanged from tumour cell suspensions (in vitro) throughout primary and metastatic colon cancer xenografts. In contrary, quantitative changes do exist in the expression of distinct variant exons.
CD44 variant isoforms, especially the functionally well-characterized v3 and v6 containing isoforms, seem to massively take part in expressing the metastatic phenotype. Our results support that higher-level v3/v6 co-expression can represent this "quasy-metastatic-gene" function at the primary tumour site and at the early phase of the metastatic cascade. Nontheless, it must be emphasized that only a minor proportion of the primary tumour mass is sufficient to hold "metastatic-phenotype" for the complete primary tumour. Although each colorectal tumour types use metastatic subclones of high CD44v3/v6 expression rate, selective examination of metastatic tumour cell group is not yet resolved. This means that quantitative evaluation of not only CD44 "in general", but even of v3/v6-containing isoforms are inappropriate for the prognosis of metastatic behavior of a single tumour case because of the summative way of the measurement techniques. Moreover this can be the background of the quite diversive results on the predictive value of "CD44" in the literature.

In vivo experimental models Isograft model
We injected suspensions of C26 isograft colorectal tumour cell line into the spleen (4 animals) [80]and colonic wall (4 animals) of adult (20-week-old, 20gweigh) Balb-C mice. Anaesthesia was performed by intraperitoneal injection of Nembuthal (70 mg/kg). Each tumour injection was performed by 0.05 ml suspension of 5x10 5 cells (10 7 cell/ml serum-free one-cell suspensions). After 4 weeks, animals were sacrificed, autopsy was performed and tumour tissues of primary tumours and liver metastases were isolated Figure 5I.a,I.b.
Human xenograft model I. -spontaneous lung metastatic model from heterotopic (subcutaneous) implantation Single cell suspension of HT25, HT29 and HCT116 cell lines (5x10 5 cells / 0.05 ml) were subcutaneously implanted into adult and newborn scid mice into the same localization (left lumbar region) using the same cell suspensions. Both primary and metastatic tumours were dissected from the sacrificed animals after 3 weeks Figure 5II. This model provides particular chance to examine "host" driven changes in the behavior of cancer cell suspensions: while in newbrn mice pulmonary metastases are being formed in each experimental cases, the same cell suspensions are never becoming "metastatic" after implantation into adult mice. The two systems (permissive and non-permissive) were only different in "host age".
Micrometastases were verified on frozen sections of the lung.

Human xenograft model II -Liver-metastasis models from orthotopic and heterotopic implantation
We performed xenotransplantation of the already mentioned three human colorectal cancer cell lines (5x10 5 cells / 0.05 ml) into the spleen (heterotopic implantation) and into the colonic wall (orthotopic implantation) of 20-25g-weigh adult scid mice. The operations were performed under intraperitoneal anaesthesia with 70 mg/kg Nembutal. Animals were sacrificed after 8-10 weeks when obvious signs of cachexia showed (more than 20% of weight loss of animal occurred). Primary and metastatic tumours were dissected during autopsy Figure 5I.a,I.b.
Samples from primary and metastatic tumours of all three experimental animal models were processed by freezing them immediately after removal for molecular biology.
All animal experiments have been approved by the local Animal Experimental Research Board (TUKEB83/2009).

Total RNA isolation
Tumour tissue was fronzen in liqued nitrogen and homogenized in mortar prior to RNA isolation. Total RNA was isolated with Trizol W (SIGMA) according to  HCT116 and HT25) were intracolonic and intrasplenic injected to scid mice; C26 mouse isograft colon cancer cell suspension was injected into Balb/C mice. In proper "metastasis" system the whole metastatic sequence is represented, while "colonisation" model represents just the later part of metastatic cascade. II. Spontaneous lung metastatsis model of subcutaneous (heterotopic) tumour injection: single cell suspension of human xenograft colon cancer was injected into adult and newborn scid mice. Primary tumours were grown in both cases, while pulmonary metastases were produced only in the newborn. manufacturer's instructions. Possible DNA contamination was eliminated with TURBO DNA-free™ (Ambion).

Reverse transcription
For reverse transcription 1 μl of 10mM dNTP mix (Finnzyme W ) and 1μl of random nonamer-oligo dT combination for a final concentration of 2.5 μM was added to 2 μg of purified total RNA. After incubating at 70°C for 10 min 2 μl of 10x M-MLV Reverse Transcriptase Buffer (Finnzyme W ), 1μl of M-MLV Reverse Transcriptase (200 units/μl, Finnzyme W ), 0.5 μl RNase Inhibitor (40 units/μl, Promega W ) and 6.5 μl RNAse free water was added for the final volume of 20 μl than incubated at 37°C for 50 min and 85°C for 10 min.
PCR products were separated on a 3% agarose gel and detected with Gel Doc 2000 (Bio-Rad W ) after ethidium bromide staining.
Transcribed isoforms were identified by re-extraction, and direct sequencing (Big Dye Terminator cycle sequencing, Applied Biosystems 3130 Genetic Analyzer) of the distinct bands.

Quantitative PCR analysis
For quantitative measurement of CD44v3 and v6 variable exons q-PCR reactions were used. Each 25 μl reaction mixture contained 12.5 μl 2X iQ SYBR W Green Supermix (Bio-Rad), 0.5-0.5 μl of each primer for final concentration of 200nM and 11.5 μl of the diluted cDNA. The cycling conditions comprised 3 min of iTaq™ DNA polymerase activation at 95°C, 40 cycles at 95°C for 30 sec, 55°C for 30 sec, 72°C for 1 min. Starting quantities were defined on the basis of standard five-fold dilution series (x1-625) carried out with control cDNA of A431 (human squamous cell carcinoma). Relative expression of the examined v3 (primers: CGT CTT CAA ATA CCA TCT CAG CA and ATC TTC ATC ATC AAT GCC TGA) and v6 (primers: GGC AAC TCC TAG TAG TAC AAC G and GTC TTC TCT GGG TGT TTG GC) variable exons were determined by normalizing the starting quantities to the housekeeping β-actin (primers: GTG GGG CGC CCC AGG CAC CA and CTC CTT AAT GTC ACG CAC GAT TTC) starting quantities from the same cDNA sample.

Next-generation, allele-specific sequencing
In order to prove identified CD44 isoforms specific to colorectal cancer, next-generation sequencing method was performed. The method briefly is as follows: PCR reaction products from HT29, HT25 and HCT116 human colorectal cell lines (total RNA isolation, purification, reverse transcription as mentioned above) with primer pair A -B (primers designed on 5' and 3' standard region of CD44 framing the entire variable region, hence theoretically amplifying all the transcribed variant isoforms at the same time) were first amplified, purified with High pure PCR clean-up micro kit (Roche W , Mannheim) according to the manufacturer's instructions. The amplicon ends were polished and ligated with roche 454 multiplex identifier (MID) adaptors, so that universal primer binding sites were generated for emulsion PCR. The amplicons were than separated with magnetic ampure beads (Agencourt), non-bound adaptors were eliminated. Emulsion PCR (emPCR) reaction was performed with molarly equivalent amount of sample amplicons according to the manufacturer's protocol (Roche 454 W ) in modified condition of low A,B primer concentration and slow PCR program to minimalize the amplification efficiency differences between longer and shorter variants. CD44 variant isoform library was clonally amplified with pyrosequencing technique (454 GS Junior, Roche W ), under the condition of 5% bead-ratio, 200 sequencing cycles, 400 bases read length mode.
Amplicon variant analyzer software (Roche Diagnostics W ) was used to eliminate multiplicity of the samples after MIDs, to trim the primers and to align the reads to the reference cDNA sequence. For easier handling of the huge database of sequenced isoforms of the critically long CD44 molecule, exon-by-exon fishing technique was adapted to identify each amplified exon-combination splice variants.
Aligment of an exon was accepted to be valid over a threshold of matching rate of 90%. The exon combinations which have more than 50 reads were reported. Similar process was performed by the four other primer combinations (Primer A-D, A-E, C-B and C-E, respectively. See Figure 2A and Methods above).

Statistical analysis
Real-time PCR results, relative expression rates (normalized on beta-actin expression rates) were compared with twosample T-tests (data were showing normal distribution). Statistical significance was assumed at p< 0.05.

Competing interests
There is no competing interest affecting the authors.
Authors' contributions BB and LRB designed and described CD44 ASP, complex experimental animal model system was deisgned by ER, isograft animal experiments, as well as spontaneous lung metastasis experimets were carried out by BT and RE, while orthotopic and colonization xenograft models were carried out by BB. TB and LRB participated in designing and performing qPCR studies, too. PB was performing next-generation sequencing on the basis of the experimental design of ER. Statistical assessment, interpretation of result data were led by ER and JT and performed by BB. Direct sequencing was carried out by TB and ER. ER and JT helped in drafting the manuscript. All authors read and approved the final manuscript.