Brenner and Rector's The Kidney, 8th ed.

CHAPTER 70. Genomics and Proteomics in Nephrology

Joseph V. Bonventre   Stephen I-Hong Hsu   Bernardo C. Vidal Jr.   Asher D. Schachter



Genomics
Challenges to Genomic Analysis of the Kidney
Application of Genomics to Salt and Water Physiology
Genomics and Kidney Allografts
Biomarker Discovery
Genomics: Study Design
Issues of Dimension
Sample Source
Defining Eligibility Criteria
Replicates
Platform and Chip Selection
Genomic Data Management
Normalization and Noise Modeling
Computational Methods
Data Interpretation
Proteomics
Introduction
Diagnostic Expression Profiling in Renal Cell Carcinoma
Diagnostic Expression Profiling in Human Urine
Biomarker Discovery by Surface-Enhanced Laser Desorption/Ionization-Mass Spectrometry
Proteomics in Animal Model Studies
Challenges and Future Prospects
Completing the Proteome Map of the Human Kidney and Urine
Advancing Proteomics to the Renal Diagnostics Arena
From Parts Catalog and Biomarkers to Systems Biology

Genomics is the study of the entire DNA sequence of an organism. The human genome is believed to consist of approximately 30,000 genes.[1] By its nature, genomics includes the study of structural genes, coding and noncoding DNA segments, and regulatory DNA sequences. Because this represents a very large data set, genomics, like the other “omics,” is computationally complex and has spawned a number of technologies developed to collate vast amounts of information in a high-throughput manner, to discover associations among the information, and to assign a physiologic or pathologic significance to it. Genomics has, in turn, led to physiologic genomics and comparative genomics. Physiologic genomics represents a link between genome information mining and better understanding of integrative physiology and clinical medicine.[2] Comparative genomics is the study of the genomes of multiple species. Usually, this is done to learn about the relative importance and, perhaps, functional significance of a DNA sequence, from the ability of this sequence to survive millions of years of evolutionary pressures and to be conserved across species. For example, the finding that eukaryotic genes encoding mitochondrial and plastid proteins are of bacterial origin led to the theory that mitochondria and chloroplasts derived from a symbiotic relationship between free-living bacteria and an ancestor of the eukaryotic cell, ultimately leading to the descendants of the bacteria being incorporated into the eukaryotic cell.

Genomics has been applied to many areas of kidney biology and pathology, including renal development, injury, the biology of epithelial cilia, and transplantation. In some cases, the main purpose is to understand physiologic or pathologic processes. In other cases, patterns of gene expression are sought that define a particular disease state or phenotype. In yet other applications, specific genes identified using genomic approaches are explored in depth to understand their potential roles, and the roles of their encoded proteins, in processes related to kidney health or disease.

Differentially expressed genes constitute a rich source of potentially valuable true or surrogate biomarkers of renal disease processes. Although considered further removed from actual disease processes in comparison with proteomics, the field of genomics has evolved to include an impressive range of robust high throughput platforms and bioinformatics tools that are now more than ever at the fingertips of translational researchers. Owing to constraints on the length of this chapter, we limit this section on genomics to examples of how these powerful approaches have been used in the kidney. In the second part of this section, we discuss the fundamentals of genomic studies, focusing on study design, data management, computational methods to address specific questions, and data interpretation so that the reader can both evaluate published studies as well as design informative new approaches to study physiologic and pathophysiologic processes in the kidney.


Challenges to Genomic Analysis of the Kidney

The kidney is a particularly challenging structure for genomic analysis. There are at least 26 morphologically and functionally distinct cell types. The risk of using cultured cells derived from the kidney is that dedifferentiation and transdifferentiation occur in culture and fundamentally change cell metabolism and functional characteristics. These changes make it questionable whether the genomic patterns of the cultured cell truly reflect gene expression patterns in vivo. Furthermore, in culture, there is loss of environmental cues that govern cell behavior in vivo. When comparing, for example, native mouse collecting duct cells to immortalized collecting duct principal cells, there are marked differences in the transcriptome.[3] Analysis of transcriptional profiles of single cells, groups of cells, or nephron segments isolated directly from the intact nephron has been aided by laser capture microdissection and laser-manipulated microdissection techniques.[4]

Kidney Development and Cell Specification

Cell specification of the kidney relies on the expression of specific genes in a complex spatially and temporally coordinated network.[5] Understanding the genomic programs that specify cell fate is extremely important in understanding how the kidney develops in normal and diseased states, and how stem cells might be involved in this development. Furthermore, because repair processes recapitulate aspects of renal development, a better understanding of the genomic patterns of the developing kidney may effectively guide understanding of repair and how we might facilitate repair when it is advantageous (such as in proliferation of proximal epithelial cells after injury) and block repair when it is detrimental (such as when interstitial fibroblasts proliferate in tubular interstitial disease).

Doucet and colleagues[3] have studied the transcriptomes of various microdissected nephron segments. They adapted the serial analysis of gene expression (SAGE) method for downsized extracts, creating a method called SAGE Adaptation of Downsized Extracts. In contrast to microchip analysis, in which mRNA of tissue extracts is hybridized to either cDNA or oligonucleotides attached to a solid support, SAGE relies on preparation, cloning, and sequencing of cDNA concatemers to create an mRNA gene expression profile.[6] Unlike microarrays, SAGE does not depend on prior knowledge of the genes to be analyzed. There is considerable axial heterogeneity of gene expression, not only of genes involved in physiologic functions such as ion and water transport but also of genes involved in the control of cell division, differentiation, and apoptosis. Cluster analysis resulted in the same gene kinship patterns when all expressed sequence tags were included as it did when only those tags associated with proliferation and differentiation were included. This suggests that there may be a causal relationship between specific genes involved with differentiation and proliferation and establishment of the final functionally distinct differentiation status of the renal cell type.

Application of Genomics to Salt and Water Physiology

Knepper and colleagues[7] have used the Brattleboro rat, which lacks endogenous vasopressin, to identify genes expressed as a result of long-term administration of vasopressin. In this hypothesis-generating approach, a number of previously unrecognized vasopressin-responsive genes were identified and confirmed to be coordinately regulated at the protein level as well. This study was particularly interesting in that it found a number of new regulated genes even though it evaluated only 1176 genes on the microarray.

As an example of another approach, Gumz and colleagues[8] evaluated the effects of aldosterone on a mouse inner medullary collecting duct cell line. They evaluated 12,000 genes and looked at expression changes 1 hour after addition of aldosterone. They found that serum- and glucocorticoid-regulated kinase (sgk), connective tissue growth factor, period homolog, and preproendothelin were all up-regulated. Whereas it was already known that sgk is an aldosterone-sensitive kinase, the other genes were unexpected and may, therefore, be novel functionally related genes. For example, connective tissue growth factor is implicated in progressive fibrosis and angiogenesis, and may contribute to the progressive tubular interstitial disease that is known to be alleviated by aldosterone blockade with spironolactone in a number of conditions.[9]

Firsov[10] has applied SAGE to identify vasopressin- and aldosterone-regulated (4-hour exposure) transcripts in the mpkCCDcl4 cell line derived from the principal cell of the mouse collecting duct. He found marked up-regulation of glucocorticoid-induced leucine zipper mRNA among 63 aldosterone-regulated transcripts. This gene has been implicated in anti-inflammatory and immunosuppressive effects of glucocorticoids in T cells, but a role in the kidney has not been established. Firsov also found 59 vasopressin-regulated genes. As these examples demonstrate, genomic approaches rapidly generate a great deal of data and a large number of hypotheses to test. The challenge then lies in identifying the hypotheses that are most likely to be relevant.

Genomics and Kidney Allografts

Genomic approaches have been employed to examine expression patterns in kidney biopsies that could be used to predict rejection or guide therapy. Sarwal and associates[11] found consistent differences in gene expression profiles associated with acute rejection, nephrotoxic drugs, chronic allograft rejection, and normal kidneys. There are a number of other published studies of differences identified among patients with various forms of kidney disease in the allograft, although it is difficult to dissect mechanisms from these studies because the number of genes whose expression is changed is quite large. [12] [13] Suthanthiran's group [14] [15] [16] [17] [18] has demonstrated that informative mRNA transcripts can be recovered from urine and reliably measured to detect viral and immune events in renal transplantation. A recent review summarizes array-based methods for diagnosis and prevention of transplant rejection.[19]

Biomarker Discovery

Gene expression patterns in blood cells, tissue, and urine have been used to identify biomarkers that could potentially be used to better diagnose onset of disease, monitor progression or regression of disease, and predict outcome. Gene expression patterns have been used for the early prediction of drug-induced pathology in the kidney. [20] [21]

Using representational difference analysis, a polymerase chain reaction (PCR)–based technique,[22] to compare gene expression in a normal versus postischemic rat kidney, and to identify genes that are up-regulated with renal ischemia, [23] [24] we have cloned kidney injury molecule-1 (Kim-1). Kim-1 (or KIM-1 in humans) encodes a type I membrane glycoprotein containing, in its extracellular portion, a novel six-cysteine immunoglobulin-like domain and a T/SP rich domain characteristic of mucin-like O-glycosylated proteins, suggesting that this molecule is involved in cell-cell or cell-matrix interaction.[25] We found that the mRNA is localized primarily to the S3 segment of the proximal tubule in the rat after ischemia/reperfusion. Antibodies raised to the protein confirmed very high levels of Kim-1 protein expression in the S3 segment with undetectable expression in normal kidneys. KIM-1 protein was also markedly up-regulated in the human kidney with acute kidney injury, and the ectodomain is released into the urine of rodents and man with injury. [24] [26] To evaluate the utility of Kim-1 as a biomarker for other types of renal injury, Kim-1 expression was determined in the rat in response to three different types of nephrotoxicants: S-(1,1,2,2-tetrafluoroethyl)-l-cysteine (TFEC), folic acid, and cisplatin. Marked increases in Kim-1 expression were confirmed by immunoblotting and immunofluorescence in all three models. Furthermore, Kim-1 protein was detected in urine of toxicant-treated rats. The temporal pattern of expression in response to TFEC is similar to the Kim-1 expression pattern in the postischemic kidney. After folic acid treatment, Kim-1 protein is present in the urine, despite no significant increase in serum creatinine. Cisplatin treatment results in early detection of urinary Kim-1 protein and diffuse Kim-1 expression in S3 cells of the proximal tubule. 
Kim-1 can be detected in the tissue and urine on days 1 and 2 after cisplatin administration, occurring before an increase in serum creatinine. We then developed a sensitive quantitative urinary test to identify renal injury in the rodent to facilitate early assessment of pathophysiologic influences and drug toxicity.[27] Urine samples were collected from rats treated with one of three doses of cisplatin (2.5, 5, or 7.5 mg/kg). At one day after each of the doses, there was an approximately three- to fivefold increase in the urine Kim-1 ectodomain, whereas other routinely used biomarkers measured in this study (plasma creatinine, blood urea nitrogen [BUN], urinary N-acetyl-beta-glucosaminidase [NAG], glycosuria, proteinuria) lacked the sensitivity to show any sign of renal damage at this time point. When rats were subjected to increasing periods (10, 20, 30, or 45 min) of bilateral ischemia, there was an increasing amount of urinary Kim-1 ectodomain.

A large pharmaceutical consortium, working independently as part of the International Life Sciences Institute working group on the application of genomics and proteomics, confirmed our finding. This consortium used an unbiased genomic approach to evaluate genes up-regulated with the nephrotoxin cisplatin. This group determined that Kim-1 mRNA was one of the genes most highly up-regulated among 30,000 genes tested.[28] In situ hybridization confirmed that Kim-1 expression was very low in control kidneys and highly induced in cisplatin-exposed kidneys. Other kidney diseases and models in which the Kim-1 gene or protein is up-regulated include cyclosporin nephrotoxicity [29] [30]; protein overload nephropathy, in which the Kim-1 protein was found at the apical membrane of dilated nephrons and in areas with inflammation, fibrosis, and tubular injury[31]; homozygous Ren2 rats, in which renal damage is induced by excessive activation of the renin-angiotensin system[32]; anti-thy-1 nephritis[33]; brain-dead donor rat kidneys[34]; and polycystic kidney disease.[35] This list keeps growing as more and more models of injury are examined by our laboratory and others.

A number of other markers of renal injury have been proposed as the result of large-scale microarray studies of kidney injury responses to agents such as mercuric chloride, 2-bromoethylamine hydrobromide, hexachlorobutadiene, mitomycin, amphotericin, and puromycin,[36] and to anti-thy-1 nephritis[33] and ischemia/reperfusion. [37] [38] [39] [40] [41] [42] [43] We have recently reviewed the growing field of mechanistic biomarkers for cytotoxic acute kidney injury.[27]

Genomics: Study Design

Given the cost and time involved, genomic studies must be carefully planned before the first sample is collected. This plan should include a detailed bioinformatics plan describing what questions are being asked and how those questions are addressed at both the study design and computational levels. Too often, the data analysis design is added post hoc, when all the samples have already been collected, preserved, processed, and assayed (usually at great cost), only to discover that important parallel controls or replicates were not performed, or more commonly that “pooling” of samples obliterated any possibility of vital noise modeling (discussed further in the next section). Before embarking on a large genomic study, it is crucial for the investigator to understand that bioinformatics tools are not developed for the purpose of cleaning up data generated by poorly designed studies. The mantra “garbage in, garbage out” will always be true for even the latest, technologically innovative computational methods. It is at the study design phase that the investigator has the maximum leverage for controlling data quality.

Issues of Dimension

A major difference between genomic studies and more traditional clinical studies is the size of the y-dimension, meaning the length of the column of data representing the number of measurements (genes), versus the size of the x-dimension, which is the number of samples or subjects. Conventional sample size and power calculations no longer apply. One major goal of high throughput, massively parallel studies is to generate hypotheses in the form of profiles, or fingerprints indicative of a specific disease state, and not to measure a handful of genes of prior interest. Investigators who already know which genes they are interested in would be better off using more conventional and less expensive measurement methods.

Sample Source

The source of the samples to be assayed has a major impact on the quality of the results. Profiling peripheral white blood cells is a rational approach for immune-mediated systemic renal diseases. More specific profiling information can be obtained if the cells are sorted before gene expression analysis. Similarly, genomic studies of renal biopsy tissue are clouded by the heterogeneous cellular composition of the glomerulus, tubular segments, interstitium, and vessels. Laser capture microdissection is an established approach to narrow down the number of cell types that are profiled, and should be considered a potentially valuable adjunct to intrarenal genomic studies. The urine can be a rich source but is also subject to heterogeneity associated with the presence of various cell types and variations in the physicochemical properties of the urine. The investigator designing a genomic study must consider the biologic information that the tissue source or sources of interest can and cannot provide, as well as any inherent confounding factors that can significantly affect the results, such as cell heterogeneity, protein binding, and pH of the sample.

The timing of sample collection is another important variable to consider. The level of expression of genes may vary over the course of the day depending on certain factors. Although it is often unrealistic to anticipate the exact time that clinical samples are made available, collecting and recording potential confounding variables such as food intake, blood pressure, activity level, medications (dose, timing, pharmacokinetics if available), and even just the time of day as in circadian cycles, can strengthen the conclusions of the study and perhaps identify alternate hypotheses not previously considered.

Defining Eligibility Criteria

Kohane, Kho, and Butte[44] emphasize the importance of “exercising the expression space” to optimize the ability to detect significant differences in gene expression patterns. For translational studies, this means paying close attention to eligibility criteria definitions. Inclusion and exclusion criteria should be defined to achieve several (at times conflicting) goals:



Recruitment of distinct, clean, consistent, and relatively homogeneous cohorts.



Maximizing the potential differences between the cohorts and control groups with respect to the specific aims of the study (exercising the expression space).



Ensuring that criteria are not overly strict to the point that subjects are rarely eligible to participate.

Furthermore, once eligibility criteria are defined, investigators must be resolved to adhere to the criteria, or risk reducing the quality of their data, and violating Institutional Review Board regulations.


Replicates

In microarray studies, “replicates” refers to split-sample assays, in which a single sample is analyzed on (preferably) three or more separate but identical assays (same platform, chip type, reagents, and so forth). The most common reason why this is not done is cost constraints. However, replicate data provide valuable information on intra-assay noise. With this information, the bioinformatician is better able to distinguish true differential gene expression from artifact (see section on noise later in this chapter). The cost need not be prohibitive because a subset consisting of a smaller number of samples can be randomly selected for replicate assays and noise modeling, as long as the same assay and platform conditions are used for the entire dataset. The consequences of not performing replicate studies as part of a large microarray project are potentially much more costly in the sense that the quality of the resultant information from the entire dataset will be reduced.
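As a minimal sketch of how replicate assays feed a noise estimate, the following Python computes a per-gene coefficient of variation from triplicate measurements. The gene names, the intensity values, and the choice of the coefficient of variation as the noise measure are assumptions made for illustration; they are not taken from any study cited in this chapter.

```python
from statistics import mean, stdev

def coefficient_of_variation(replicates):
    """Intra-assay noise for one gene: sample SD divided by the mean."""
    return stdev(replicates) / mean(replicates)

# Hypothetical triplicate intensities from one split sample (illustrative only)
replicate_intensities = {
    "gene_1": [100.0, 104.0, 96.0],   # tight replicates: low noise
    "gene_2": [50.0, 90.0, 20.0],     # scattered replicates: high noise
}

noise_by_gene = {
    gene: coefficient_of_variation(values)
    for gene, values in replicate_intensities.items()
}

for gene, cv in noise_by_gene.items():
    print(f"{gene}: CV = {cv:.3f}")
```

A gene whose replicate measurements scatter widely, like the hypothetical gene_2 here, would need a much larger between-group difference before that difference could be distinguished from artifact.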

Platform and Chip Selection

There is no single best platform or chip type. The specific aims of the study dictate which platform is best suited to address the questions of interest. Once a platform is selected, attention should be paid to ensuring that a single generation of chip is used, if possible. Although there are methods for integrating data from different generations of chips,[45] the cleanest solution is to plan ahead to use a single chip generation.

The impact of genomic array platform differences is an active area of research. [46] [47] [48] In general, consistency is optimal when standardized procedures are used for sample labeling and processing, and particularly when a single platform is used. The MicroArray Quality Control project is performing studies that clarify the relevant issues.[48]

Genomic Data Management

Data Types and Formatting

There are four main types of data:



Text: for example, a clinic note or operative report



Binary: yes or no (commonly notated as 1 or 0)



Discrete: classes, categories, rankings, grades, and so forth



Continuous: real numbers

A bioinformatics plan that addresses the questions for a given study should take into account the data types that are generated. Modeling approaches that operate on continuous data may not be optimal for studies that include binary or discrete data types. Furthermore, recording each data type appropriately in clinical studies markedly improves the speed and accuracy of data analysis. Most importantly, data types should not be intermingled in a given data column. The widespread mixing of data types in spreadsheets that are used as databases has unfortunately rendered many potentially valuable datasets difficult to access. Comments should not be entered in a column that contains a different data type. Whereas text parsing is an ongoing area of development in the field of medical informatics, it is not an ideal solution in comparison to proper data formatting. High-throughput genomic studies generate data output files that can be computationally restructured to facilitate data analysis, but this restructuring can be very time consuming if there are many individual chip files with proprietary tags or locks.
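The rule against intermingling data types within a column can be checked mechanically. The sketch below, in Python, declares one type per column and reports every cell that breaks the declaration. The column names, the schema, and the example rows are hypothetical and serve only to illustrate the idea.

```python
# One declared data type per column (hypothetical clinical dataset)
SCHEMA = {
    "diabetic": "binary",        # 1 or 0
    "ckd_stage": "discrete",     # integer category
    "creatinine": "continuous",  # real number
    "note": "text",              # free text kept in its own column
}

def conforms(value, kind):
    """Return True if a single cell matches the declared data type."""
    if kind == "binary":
        return isinstance(value, int) and value in (0, 1)
    if kind == "discrete":
        return isinstance(value, int)
    if kind == "continuous":
        return isinstance(value, (int, float))
    if kind == "text":
        return isinstance(value, str)
    raise ValueError(f"unknown data type: {kind}")

def find_violations(rows, schema):
    """List (row index, column, value) for every cell of the wrong type."""
    return [
        (index, column, row[column])
        for index, row in enumerate(rows)
        for column, kind in schema.items()
        if not conforms(row[column], kind)
    ]

rows = [
    {"diabetic": 1, "ckd_stage": 3, "creatinine": 1.8, "note": "stable"},
    # A comment typed into a numeric column, the mistake described above
    {"diabetic": 0, "ckd_stage": 2, "creatinine": "1.2 (hemolyzed)", "note": ""},
]

violations = find_violations(rows, SCHEMA)
print(violations)
```

Running a check of this kind at data entry, rather than at analysis time, keeps the comment in the text column where it belongs and the measurement in the numeric column.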

Data annotation standards markedly improve the quality of information that can be learned from a single dataset and from combined datasets that are publicly available. The MIAME standard is defined as the Minimum Information About a Microarray Experiment that is needed to enable the interpretation of the results of the experiment unambiguously and sufficient to reproduce the experiment. The Gene Expression Omnibus (GEO) is one of the resources of the National Center for Biotechnology Information. GEO is a repository for a large number of MIAME-compliant publicly available genomic datasets. The value of adhering to data-formatting standards was recently demonstrated by Butte,[49] who leveraged the resources in GEO to create phenome-genome networks. Kim and colleagues[50] demonstrated how the use of standardized datasets could improve the information gleaned from small sample sizes. As more “omics” emerge, the appropriate use of data-formatting standards will optimize the quantity and quality of information that one can learn from a wide range of studies.

Normalization and Noise Modeling

Normalization is a technique that improves the comparability of samples within a dataset. Normalization procedures are generally based on statistics and probability theory. One common method transforms each sample's gene expression values such that each sample has a mean of zero and a standard deviation of one. A plot of one sample's gene expression values against a second sample's value appears identical before and after normalization except that the scales on the axes will change. Normalization reduces the chance that between-chip noise is interpreted as significant differential gene expression.
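The mean-zero, unit-standard-deviation transformation described above can be sketched in a few lines of Python. The intensity values are hypothetical, and the use of the population standard deviation is an assumption; the sample standard deviation is an equally common choice.

```python
from statistics import mean, pstdev

def zscore_normalize(values):
    """Rescale one sample's expression values to mean 0 and SD 1."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD (an assumption of this sketch)
    if sigma == 0:
        raise ValueError("all values are identical; cannot normalize")
    return [(v - mu) / sigma for v in values]

# Hypothetical raw intensities for five genes on two chips.
# Chip B has the same profile as chip A but was scanned twice as bright.
chip_a = [120.0, 340.0, 560.0, 80.0, 900.0]
chip_b = [240.0, 680.0, 1120.0, 160.0, 1800.0]

norm_a = zscore_normalize(chip_a)
norm_b = zscore_normalize(chip_b)

for a, b in zip(norm_a, norm_b):
    print(f"{a:+.3f}  {b:+.3f}")
```

Because the two hypothetical chips differ only by a brightness factor, their normalized profiles coincide, which is the sense in which normalization keeps between-chip noise from being read as differential expression.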

Quantifying noise and understanding its sources is probably the single most crucial factor in genomic studies. Without an understanding of noise, it is impossible to detect meaningful signals. Study design factors that can affect noise modeling were discussed previously. Procedural consistency at every level of investigation, from sample collection through to data analysis and interpretation, is the clinical investigator's major weapon for reducing noise.

With respect to genomic studies, the importance of replicates cannot be overstated. The tendency to pool samples is even more detrimental than a lack of replicates. Pooling samples in genomic studies is never of any benefit to study design or data analysis. Information is always lost when samples are pooled. As an example, imagine a study comparing gene expression in groups A versus B. Each group has 10 subjects. In study 1, group A's samples are pooled into a single sample, and the same is done for group B. Gene x is measured as having a mean expression level of 2584 in group A and 3825 in group B ( Fig. 70-1 , left side). Is gene x expression significantly higher in B versus A? It is impossible to tell without some measure of the intragroup variability, such as standard deviation or variance. Furthermore, the results are misleading: In contrast to the differences in means, the median expression values for gene x are 2541 and 2315 in A and B, respectively (see Fig. 70-1 , right side). So now the expression of gene x appears to be lower in B, based on the very same data points, a conclusion that is lost when samples are pooled.

FIGURE 70-1  Pooling samples can lead to misleading conclusions. Plots show expression values for a hypothetical gene “x” in groups A and B, with 10 subjects in each group. Only the mean value is observable with the pooled data (left side), suggesting that expression of gene x is higher in the B group. However, boxplots (right side) show that the median values suggest that, in fact, the opposite conclusion is correct and the different data distributions clarify why the means differ from the medians.
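The loss of information from pooling can be reproduced with a small numerical sketch in Python. The expression values below are invented for illustration and are not the data behind Figure 70-1; the sketch also assumes that a pooled assay behaves like the arithmetic mean of the individual samples.

```python
from statistics import mean, median, stdev

# Hypothetical expression of gene x in 10 subjects per group (illustrative only)
group_a = [2400, 2450, 2500, 2520, 2540, 2560, 2600, 2650, 2700, 2800]
group_b = [2000, 2100, 2200, 2250, 2300, 2330, 2400, 7000, 8000, 9000]

def summarize(values):
    """Statistics available only when each subject is assayed individually."""
    return {"mean": mean(values), "median": median(values), "sd": stdev(values)}

# A pooled assay yields one number per group: (approximately) the mean
pooled_view = {"A": mean(group_a), "B": mean(group_b)}

# Individual assays preserve the distribution within each group
full_view = {"A": summarize(group_a), "B": summarize(group_b)}

print("pooled:", pooled_view)
print("individual:", full_view)
```

In this invented dataset, three high-expressing subjects pull the group B mean above that of group A even though the typical group B subject expresses less of gene x, and only the individual assays reveal the wide spread within group B.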


An additional disadvantage of pooling is that it precludes the investigator from proposing alternate hypotheses related to novel subgroups of patients. Perhaps the categorization of subjects into groups A and B is flawed or overlooks a key distinguishing variable that has not yet been considered. Figure 70-2 shows the same two example groups as above (with five gene profiles added per subject) projected into the top three dimensions of principal component space. Clearly, four of the 10 subjects in the B group are very tightly clustered with all 10 subjects in the A group. The remaining six subjects in the B group are scattered. However, three of the remaining six B group subjects appear to be on an exponential trajectory extrapolated from the A group. These findings would suggest that genomic profiles may define alternate and perhaps novel categorization of subjects into two or more different groups. This information would have been completely lost if the samples had been pooled.

FIGURE 70-2  Pooling samples masks valuable, potentially novel information. Principal component analysis shows that group A samples (purple circles) are closely clustered together, and that some group B samples (green circles) might in fact be more closely related to group A samples, suggesting that novel genomic-defined subgroups may be present. This level of analysis is not possible with pooled samples.
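A projection into principal component space can be sketched without any external library. The Python below centers the data, builds the covariance matrix, and finds only the first principal component by power iteration; the analysis in the figure uses the top three components, so this is a simplified illustration. The six subjects and their three-gene profiles are hypothetical.

```python
import math

def center(rows):
    """Subtract each gene's mean so that every column has mean zero."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    return [[r[j] - means[j] for j in range(dims)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of the centered data (genes x genes)."""
    n, dims = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]

def first_component(cov, iterations=200):
    """Power iteration: unit eigenvector for the largest eigenvalue."""
    dims = len(cov)
    vec = [1.0] * dims
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec

def project(centered, component):
    """Score of each subject along the given component."""
    return [sum(x * c for x, c in zip(row, component)) for row in centered]

# Hypothetical profiles: three genes measured in six subjects
subjects = [
    [10.0, 5.0, 1.0],   # A1
    [10.2, 5.1, 1.1],   # A2
    [9.9, 4.9, 0.9],    # A3
    [10.1, 5.0, 1.0],   # B1, labeled B but indistinguishable from group A
    [20.0, 10.0, 2.0],  # B2
    [30.0, 15.0, 3.0],  # B3
]

centered = center(subjects)
pc1 = first_component(covariance(centered))
scores = project(centered, pc1)
print([round(s, 2) for s in scores])
```

In this toy dataset the subject labeled B1 scores alongside the group A subjects, while B2 and B3 fall far away, which is the kind of subgroup structure that a pooled sample cannot show.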


Computational Methods

The advent of massively parallel genomic profiling technologies popularized the hypothesis generation approach to biomarker discovery, in which a wide net is cast, followed by a search for the nuggets of information embedded within the huge volumes of generated data. The development of these high-throughput technologies spurred active innovation of computational methods to visualize the data and extract manageable information. A multitude of bioinformatics tools are now available to address a wide range of questions in genomics and biomarker discovery.

Biomarker discovery algorithms can be categorized according to desired goals:



Identification and validation of diagnostic or prognostic biomarkers: supervised learning (classification) algorithms.



Identification of biomarker profiles that define novel pathophysiologic categorization of a disease process: unsupervised learning (clustering) algorithms.

Supervised Learning

In supervised learning, the algorithm attempts to “learn” how to predict an outcome based on a set of input values. The algorithm learns from a training set, which consists of numerous and varied examples of input values and their corresponding observed output values. The trained algorithm is then tested on an independent test set in which only the input variables are provided, and the algorithm's output (the predicted outcomes) is compared against the test set elements' true (observed) outcome values. The sensitivity and specificity of the trained model can then be calculated. Examples of supervised learning algorithms include artificial neural networks,[51] support vector machines,[52] and genetic algorithms (optimization algorithms).[53]
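The train, test, and evaluate sequence can be illustrated with a deliberately simple classifier. The Python sketch below uses a nearest-centroid rule, chosen here only for brevity; it is not one of the algorithms listed above. The two-gene expression profiles and the class labels are hypothetical.

```python
def train_centroids(samples, labels):
    """Learn the mean expression profile (centroid) of each class."""
    sums, counts = {}, {}
    for profile, label in zip(samples, labels):
        acc = sums.setdefault(label, [0.0] * len(profile))
        for i, value in enumerate(profile):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, profile):
    """Assign the class whose centroid is closest in squared distance."""
    def sq_dist(centroid):
        return sum((p - q) ** 2 for p, q in zip(profile, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def sensitivity_specificity(true_labels, predicted, positive):
    """Compare predictions with observed outcomes on the test set."""
    pairs = list(zip(true_labels, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)

# Training set: inputs with known outcomes (hypothetical two-gene profiles)
train_x = [[8, 1], [9, 2], [7, 1], [2, 6], [1, 7], [3, 8]]
train_y = ["disease", "disease", "disease", "healthy", "healthy", "healthy"]

# Independent test set: only the inputs are shown to the trained model
test_x = [[8, 2], [4, 6], [2, 7], [3, 6]]
test_y = ["disease", "disease", "healthy", "healthy"]

model = train_centroids(train_x, train_y)
predictions = [predict(model, profile) for profile in test_x]
sensitivity, specificity = sensitivity_specificity(test_y, predictions, "disease")
print(predictions, sensitivity, specificity)
```

The second test subject has an atypical profile and is missed by the model, so the sketch reports imperfect sensitivity alongside perfect specificity; keeping the test set separate from the training set is what makes those figures meaningful.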

Unsupervised Learning

In unsupervised learning, the algorithm tries to find patterns within a dataset, instead of trying to learn how to predict the “correct answer” based on a training set in which the “correct answer” or outcome is known for each element. Within unsupervised learning, there are three classes of techniques: (1) feature determination (determine genes with interesting properties) without specifically looking for a particular pattern determined a priori, such as using principal component analysis[54] and vector algebra[55]; (2) cluster determination (determine groups of genes or samples with similar patterns of gene expression), using nearest neighbor clustering, k-means clustering, or one- and two-dimensional dendrograms; and (3) network determination (graphs representing gene-gene or gene-phenotype interactions), using Boolean networks, Bayesian networks,[56] and relevance networks.[57]
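Of the cluster determination methods named above, k-means is the easiest to sketch. The Python below alternates between assigning each sample to its nearest centroid and moving each centroid to the mean of its members. Starting from the first k samples is a simplification chosen to keep the example deterministic, and the two-gene profiles are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two expression profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=20):
    """Basic k-means: returns (cluster index per point, final centroids)."""
    # Deterministic start: use the first k points as initial centroids
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point
        assignments = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids

# Hypothetical two-gene profiles from six samples; no outcome labels are given
samples = [
    [1.0, 1.1], [5.0, 5.2], [1.2, 0.9],
    [5.1, 4.9], [0.9, 1.0], [4.8, 5.0],
]

assignments, centroids = kmeans(samples, k=2)
print(assignments)
print(centroids)
```

No outcome labels enter the computation at any point; the grouping emerges from the expression values alone, which is what distinguishes this from the supervised setting.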

Combining Tools

Each algorithm has particular strengths. It is sometimes advantageous to create a synergistic analysis by combining two or more methods. For example, a genetic algorithm can be used to determine the set of input values that result in maximum separation of samples or genes in the top three dimensions of principal component space[58] or to optimize the inputs going into an artificial neural network[59] or support vector machine.[60] No single tool is ideal for all situations, and the use of single or combination algorithms should be tailored to the data types, study design, and the questions at hand.

Data Interpretation

Cutting-edge algorithms are of no use if clinical investigators do not gain applicable information from the results. Microscopic dendrograms placed over multicolored chip images are not intuitively informative. Reducing a list of 30,000 genes to a list of 1500 genes is a good start, but the human brain cannot integrate the potential relationships and subrelationships between those 1500 genes. Furthermore, it does not make much sense to perform genomic profiling for thousands of genes over dozens of samples only then to focus on a handful of genes in the resultant profiles because those genes were of prior interest to the investigator. If the investigator already knows which specific genes are of interest, more conventional (and less expensive) gene expression techniques such as real-time PCR (RT-PCR) will yield more reproducible results. However, if the investigator is interested in finding genes that are coordinately expressed with an existing set of genes of interest, microarray studies can be of value.

Functional clustering tools give a broader, more accessible understanding of large-scale genomic results. One such tool is the Expression Analysis Systematic Explorer (EASE). EASE is freely available online. It uses the Gene Ontology database to categorize and cluster genes by function or pathway, scores each cluster, then ranks the clusters based on the score. Many of the top-ranked clusters, however, tend to be very large, nonspecific categories. A more robust system is the proprietary Ingenuity Systems Pathways Analysis, which requires a paid subscription. The value of functional clustering methods is to clarify the predominant pathways and processes that are represented by the thousands of genes in large datasets. Thus, the investigator can, for example, claim that there is differential expression of angiogenic, apoptotic, and specific G-protein signal transduction pathway genes in the dataset.
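The idea of scoring and ranking functional categories can be sketched with a hypergeometric over-representation test. This is an assumed scoring scheme used only for illustration and is not presented as the scoring method of EASE or of any other tool named above. The category sizes and hit counts are hypothetical; the population of 30,000 genes and the list of 1500 genes echo the figures used earlier in this section.

```python
from math import comb

def enrichment_p(population, category_total, list_size, list_hits):
    """Upper-tail hypergeometric probability: the chance of seeing at least
    list_hits category members in a random gene list of size list_size."""
    upper = min(category_total, list_size)
    favorable = sum(
        comb(category_total, k) * comb(population - category_total, list_size - k)
        for k in range(list_hits, upper + 1)
    )
    return favorable / comb(population, list_size)

POPULATION, LIST_SIZE = 30000, 1500

# Hypothetical categories: (genes in category, genes from the list in category).
categories = {
    "angiogenesis": (200, 30),
    "apoptosis": (400, 22),
    "signal transduction": (6000, 310),
}

# Rank categories from most to least over-represented (smallest p first).
ranked = sorted(
    (
        (name, enrichment_p(POPULATION, total, LIST_SIZE, hits))
        for name, (total, hits) in categories.items()
    ),
    key=lambda item: item[1],
)
for name, p in ranked:
    print(f"{name}: p = {p:.3g}")
```

A large category with many hits can still score poorly under a scheme like this when its hit count is close to what chance alone would produce.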

Public Data

The growing volume of publicly accessible genomic data provides opportunities to explore combinations of other datasets, and to compare one's own data with other datasets. The implementation of data formatting standards such as MIAME has greatly enhanced the value of public data. In addition to GEO, there is also the Pharmacogenetics Research Network's PharmGKB database, which is an “integrated knowledge base for pharmacogenetics linking phenotypes and genotypes.” Investigators with a goal to answer very specific questions may find these resources to be of limited value because the goals of the investigator likely differ from those of the database curator. However, these resources have intrinsic value to the enterprising bioinformatician interested in discovering novel information. For example, Butte and Kohane[49] used publicly available datasets and data annotation standards such as MIAME and the Unified Medical Language System to create phenome-genome networks, relating genomic data to clinical phenotype data on a large scale.

Computational Programming Environments

Manipulating genomic datasets in an electronic spreadsheet is neither practical nor advisable. Two commonly used analytic software tools are MATLAB (The MathWorks, Inc) and R (R Foundation for Statistical Computing).[61] Both MATLAB and R have steep learning curves, but if used consistently, either one will become a bioinformatician's most valuable tool. MATLAB is a proprietary tool with excellent graphing capabilities and many useful “toolboxes” that are under continued development. R is a powerful, free, open-source statistical programming environment that also has excellent graphing functions, a vast array of bundled mathematical, statistical, and modeling functions, and is an integral component of Bioconductor, which is an open-source bioinformatics resource development project. For example, one can freely download platform-specific microarray R tools and “vignettes” from Bioconductor.



Proteomics generally refers to the study of the entire complement of proteins (the proteome) expressed in a biologic functional unit (i.e., cellular organelle, cell, tissue, organ, or organism). A major motivation for such a specialized discipline alongside genomics is the realization of the limits of biologic knowledge that may be derived from genes and gene transcripts alone. The flow of information from genes to proteins is mediated by a number of regulatory networks and modifications, such that knowledge of the genome alone may be insufficient to predict the entire proteome. Because an understanding of molecular biologic processes is ultimately gained from exquisite and detailed knowledge of protein structure and function, it follows that proteomics represents an important, distinct, and complementary discipline in the postgenomic era.

A relatively young discipline, proteomics' rapid development has been greatly aided by improvements in methods and technology in the high-throughput analysis of proteins and their chemical modifications (e.g., proteolysis products, chemical cleavage, covalent modifications, and so forth). Thus, the term proteomics is alternately used to denote a host of different techniques that are capable of analyzing complex protein mixtures, as opposed to single-protein assays. These techniques include two-dimensional gel electrophoresis (2DE), surface-applied or chip-based mass spectrometry (e.g., matrix-assisted laser desorption ionization mass spectrometry, MALDI-MS; surface-enhanced laser desorption ionization mass spectrometry, SELDI-MS) and liquid-separation interfaced mass spectrometry (LC-MS and CE-MS). In general, these techniques are employed to achieve separation of the constituent proteins, followed by their detection and identification. The last step commonly uses mass spectrometry to obtain a mass signature of the enzyme-digest products of proteins, termed the peptide mass fingerprint (PMF), or a peptide sequence (a sequence tag) by tandem mass spectrometry (MS/MS). (See Fig. 70-3 for a brief illustration of these methods.) They constitute the main arsenal in the large-scale effort to catalog proteins expressed in both kidney tissues and urine.
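The notion of a peptide mass fingerprint can be made concrete with a small in silico digest. The sketch below assumes trypsin as the enzyme (cleavage after lysine or arginine unless the next residue is proline) and computes monoisotopic masses of the resulting neutral peptides from standard residue masses. The protein sequence is a made-up toy example, and real search engines additionally handle missed cleavages, modifications, and charge states.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide

def tryptic_digest(sequence):
    """Cleave after K or R, except when the following residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        following = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and following != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides

def peptide_mass(peptide):
    """Monoisotopic mass of the neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

def fingerprint(sequence):
    """Sorted list of (mass, peptide) pairs: the theoretical PMF."""
    return sorted((peptide_mass(p), p) for p in tryptic_digest(sequence))

# Hypothetical toy sequence, used only to exercise the functions.
toy_protein = "GAKRPLKM"
for mass, peptide in fingerprint(toy_protein):
    print(f"{peptide}: {mass:.4f} Da")
```

Identification by PMF then amounts to comparing a list of observed masses against such theoretical lists for every protein in a sequence database, within an instrument-dependent mass tolerance.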

FIGURE 70-3  (A) 2DE map of the human kidney glomerulus. The spot highlighted in red was identified by Yoshida and co-workers[64] as belonging to β-microglobulin. Inset shows an MS/MS spectrum matched by the Global Proteome Machine to an amino-acid sequence of a tryptic peptide derived from β-microglobulin, illustrating one of two standard routines in identifying 2DE spots. 2DE, two-dimensional gel electrophoresis; MS/MS, tandem mass spectrometry. (B) A mass spectrum obtained by SELDI-MS, showing peaks in the less than 10-kDa region of the normal urinary proteome, captured on a hydrophobic ProteinChip (picture shown in the background). Using bioinformatics, potential diagnostic features are mined from these raw data to discriminate between disease and normal, or between related but clinically distinct diseases. SELDI-MS, surface-enhanced laser desorption/ionization-mass spectrometry.


The Kidney Proteome

A paramount goal of proteomics is the identification of the comprehensive set of proteins expressed by a cell, tissue, or organ in various contexts (in health and disease, developmental stage, stress response, and other states of perturbation). Much effort has been devoted to constructing a reference set, for example, in normal health, against which the dynamics of protein expression may be measured. The kidney, with its tissue compartments consisting of highly diverse cell types, has been the subject of numerous such studies. Most of them employ 2DE because of the desirability of producing a compact visual presentation of separated proteins in gel images, often referred to as a reference map. The human renal cortex is one such part of the kidney that has been the focus of proteomic identification. The work of Magni and co-workers[62] has so far produced the largest set of identified proteins (89 unique proteins and 74 isoforms) using a combination of antibody-assisted enrichment of endothelial cells in the sample and 2DE with PMF. This work expands on an earlier and smaller list of proteins expressed in the renal cortex identified by Sarto and co-workers,[63] which included some 28 unique proteins and 17 isoforms. More recently, Yoshida and co-workers[64] completed a 2DE reference map for the normal human glomerulus, using highly purified samples of glomeruli individually identified and harvested under phase-contrast microscopy. Their effort resulted in 212 proteins identified by both PMF and MS/MS.

Other global identification studies have focused more on differences between proteins expressed in different functional cell types, tissues, or anatomic locations of the kidney, in order to highlight proteins that are related in function or expressed in specific regional structures. Witzmann and co-workers,[65] combining 2DE with PMF and MS/MS, Western blotting, and analogous-spot position matching, were able to identify 14 unique proteins and protein variants differentially expressed between rat kidney cortex and medulla. Arthur and co-workers[66] expanded this list to include an additional 16 proteins out of 54 identified by PMF. A few proteins showed inconsistent trends between the two studies (e.g., actin was more abundant in medulla in the former study but not significantly different in the latter). This is likely a consequence of employing different sample sources (cytosolic sample as opposed to whole-cell homogenates), but it may also reflect errors inherent in the process of composing and matching separate gel images. The technique of differential 2DE (or DIGE) obviates this requirement of matching separate gels, because sample and reference are run simultaneously on the same gel after being labeled separately with either radioisotope or fluorescence labels. Hoffert and co-workers,[67] using cyanine dyes (Cy3 and Cy5), applied this technique to distinguish fractions enriched with inner medullary collecting duct (IMCD) cells from an IMCD-depleted fraction. Out of 85 identified protein spots, 50 were shown to differ in abundance between these two fractions, with those enriched in IMCD having potential roles in maintaining the collecting duct structure, AQP2 trafficking, osmotic stress response, and peptidase activity.

The Urinary Proteome

The urine is another important biologic specimen for proteomic studies, perhaps surpassed only by human plasma in terms of its potential as a diagnostic “gold mine.” Urine has the advantage of being a bodily fluid that is readily obtainable in large quantities and at frequent intervals, important when the clinical diagnostic goal is to closely monitor rapid biologic responses in patients during the course of disease as well as during therapy. The usefulness of urine in the diagnosis of renal diseases is often assumed on the basis of the abnormal excretion of proteins associated with both the disease process and the host response (e.g., inflammatory kidney diseases and the release of mediators of inflammation into the urine). As is rapidly becoming abundantly clear, however, detecting and identifying the multitude of proteins present in the urine is a difficult and unique technical challenge, made so by the wide range of abundance of its constituent species, the highly variable fragments that are produced from the proteolysis of its major component proteins, and the presence of abundant nonprotein molecules such as urea, salt, and other excreted metabolites.

In their early attempt at unraveling the urinary proteome, Davis and co-workers[68] and Spahr and colleagues[69] applied a multidimensional MS/MS approach to whole trypsin-digested urinary proteins, whereas Pieper and co-workers[70] used sample prefractionation followed by 2DE and spot identification by both PMF and LC-MS/MS. The combined distribution of proteins identified in urine by these two methods (Fig. 70-4) shows that classic plasma proteins constitute a mere 46% of the total, whereas the rest mostly classify under cytosolic (22%) and membrane (11%) proteins, thus revealing a much more complex picture of the urinary proteome than one that is attributed to being a plasma subfraction. An important study by Pisitkun and co-workers[71] suggests a potential source of these cytosolic and membrane proteins by examining exosomes secreted into urine. A proteomic profile obtained from SDS-PAGE and LC-MS/MS of these membrane vesicles derived from multivesicular bodies identified 295 proteins in a single subcellular component, highlighting the importance of deep interrogation of the urinary subproteome.

FIGURE 70-4  The distribution of the total, nonredundant urinary proteins identified by Davis and co-workers[68] and Spahr and associates,[69] who used a shotgun MS/MS approach, and Pieper and co-workers,[70] who applied 2DE with PMF. Only about 25% of the total identified proteins (or 54 of 220 unique gene-products) represented are overlapping in the studies. The significant proportion of nonplasma proteins in them, about 50% of the total, indicates a urinary proteome much more complex than was previously assumed. PMF, peptide mass fingerprinting.


The notably small proportion of overlapping proteins (25%) identified by the two distinct proteomic approaches described above points to an inherent technical bias resulting in differential “coverage” of the urinary proteome. The missed identifications by Pieper and co-workers[70] may reflect a trade-off between a reduced complexity and abundance range on the one hand, and sample losses on the other, resulting from the more extensive prefractionation and immunosubtraction of highly abundant species. An extreme case is illustrated by Oh and co-workers,[72] wherein sample preparation of normal urine resulted in a mere 6% of the total protein amount being recovered during the final analytical (2DE) step, from which 113 spots were assigned PMF identifications. The extensive sample preparation included two dialysis steps (four orders of dilution) and depletion of albumin. The selective overrepresentation of subsets (in terms of size and pI) of urinary proteins observed when using either acetone precipitation or ultracentrifugation to concentrate urine[73] also suggests a limitation in the coverage of the proteome resulting from use of any one method of analysis. A recent systematic evaluation of various preparative methods has shown that precipitation by 90% ethanol gives the highest protein yield, whereas acetonitrile precipitation appeared to produce the greatest number of spots in a 2DE analysis.[74] Solid-phase extraction of urinary proteins suggests a shorter and alternative preparative route for 2DE than that already established through precipitation.[75]
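The overlap between two identification sets can be quantified as the shared fraction of the combined, nonredundant list. The sketch below shows the calculation on two hypothetical sets of protein names, then repeats it for the counts quoted in the legend of Fig. 70-4 (54 shared out of 220 unique gene products).

```python
def overlap_summary(set_a, set_b):
    """Return (shared count, nonredundant total, shared fraction of the total)."""
    shared = set_a & set_b
    combined = set_a | set_b
    return len(shared), len(combined), len(shared) / len(combined)

# Hypothetical identifications from two analytical approaches.
shotgun_ids = {"albumin", "uromodulin", "transferrin", "kininogen", "cystatin C"}
gel_ids = {"albumin", "uromodulin", "haptoglobin", "zinc-alpha-2-glycoprotein"}

n_shared, n_total, fraction = overlap_summary(shotgun_ids, gel_ids)
print(n_shared, n_total, round(fraction, 2))

# Counts quoted in the Fig. 70-4 legend: 54 overlapping of 220 unique gene products.
reported_overlap_pct = round(100 * 54 / 220, 1)
print(reported_overlap_pct)  # 24.5, i.e., about 25%
```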

A recently reported approach to sample enrichment, which has been applied with good results to plasma samples, employs large combinatorial libraries of ligands to affinity-capture polypeptides from urine. Castagna and co-workers[76] equilibrated urine samples with hexameric peptides conjugated to solid-phase supports and released the captured polypeptides with strong elution buffers. This procedure resulted in 383 unique gene-product identifications, the highest so far achieved in urine proteomics by 2DE analysis, and more than twofold higher than any result previously attained by a similar method. Despite this feat, the degree of overlap of their data set with that of previously cited studies is small (approximately one third of total identified proteins). Thus a significant number of proteins identified by previous studies (approximately 270) were missed even by the ligand-library method, a method that initially held the promise of a substantially greater depth of analysis, believed to be otherwise inaccessible to pre-existing methods, owing to the predicted high diversity of potential ligands.

Proteomics in Renal Disease and Health

In the clinical diagnosis and treatment of kidney disease, a major priority is the identification of disease-associated biomarkers that may find application in the population-based preventive screening of early kidney disease, in the early detection of acute kidney injury, in the noninvasive diagnosis of acute renal allograft rejection, and in the specific noninvasive diagnosis and prognosis of primary and secondary renal diseases, to name just a few examples. The requirements of an ideal disease biomarker, as listed in Table 70-1 for the case of acute kidney injury, are difficult to satisfy, explaining why there has been a notable and general lag in the introduction of new biomarkers to the clinical setting.[77] Achieving this ideal may entail a shift from the one marker-one disease concept that is the basis for most existing clinical biomarkers, to the paradigm of a multimarker diagnostic signature that relies on a combination of multiple qualitative and quantitative biomarkers to discriminate between normal and disease states or to distinguish between related but distinct diseases. Insofar as proteins are the direct effectors and mediators that determine disease phenotypes, a proteomic approach to disease diagnosis would appear to be ideal. Proteomics, with its high-throughput unbiased approach to the analysis of variations in protein expression patterns (actual phenotypic expression of genetic variation), holds enormous promise in capturing the complex molecular response that arises from clinically significant perturbations to the normal state of a physiologic system (i.e., from the healthy state to the diseased state). A number of recent articles have reviewed the application of proteomics to renal disease biomarker discovery.[78] [79] The following discussion emphasizes the most recent advances in this rapidly moving, albeit nascent field, in the hope of capturing developmental trends, identifying areas of intense activity, and projecting future directions.

TABLE 70-1   -- Characteristics of an Ideal Biomarker for Acute Tubular Injury

Easy to measure, preferably at bedside

Detectable in urine or blood and stable in these fluids

Ideally made only by kidney

Better if it is produced by proximal tubule rather than filtered by glomerulus and reabsorbed by proximal tubule

Distinguishes tubular from prerenal, postrenal, and glomerular injury

Can be detected early in the course of acute tubular injury before creatinine and BUN increase and casts appear in the urine

Useful to monitor severity and progression or regression of injury and track the natural history of the injury

Predicts outcome

Predicts therapeutic response

Can be used in both preclinical and clinical studies

Acceptable by the FDA as a predictor of a clinical outcome

Understandable function of the marker in the kidney


BUN, blood urea nitrogen; FDA, Food and Drug Administration.




Diagnostic Expression Profiling in Renal Cell Carcinoma

One of the most obvious applications of proteomics to disease-biomarker discovery is in expression profiling. Quantitative imaging of spots in 2DE and MS ion intensity measurements after radioisotope or chemical labeling allow comparison of protein abundances between samples, such as between disease and normal control. This approach has been used extensively in the search for potential markers of renal cell carcinoma, wherein clinical tissue samples are often obtained from biopsy, consisting largely of malignant sections with surrounding normal tissues. Most studies using 2DE have generally shown a pattern of altered expression consistent with the Warburg effect, that is, an increased glycolytic flux at the expense of gluconeogenic reactions. [80] [81] [82] Follow-up studies have focused on two proteins earlier identified to have the most altered expression, agmatinase and ketohexokinase, validating them at both transcript and protein level. [83] [84] [85] In addition to showing their down-regulation in renal cell carcinoma (RCC), agmatinase was localized to the mitochondria and its disease-association linked to putative roles in polyamine biosynthesis and nitric oxide (NO) generation,[83] whereas ketohexokinase activity was correlated with tumor progression stages.[85] Sarto and colleagues[86] reported modifications of heat shock protein 27 (HSP–27) in RCC, appearing as changes in the number, position, and intensity of 2DE spots when compared against normal control. Using MS/MS, they subsequently characterized HSP–27 post-translational modifications, in particular phosphorylation at positions S82 and (less definitely) S15, as contributing to these differential spot distributions.[87]

Klade and co-workers[88] and Kellner and co-workers[89] used a variant approach of 2DE in which spots obtained from the 2DE analysis of tumor and normal tissues, reacting positively against antibodies sourced from both autologous and allogeneic sera of patients and normal volunteers, were identified by Edman degradation. This analysis demonstrated the presence of smooth muscle 22-α and carbonic anhydrase I as antigenic species in RCC that are not detected in normal control.[88] Among other proteins reported to have shown altered expression in RCC,[89] only vimentin, a cytoskeletal protein expressed by mesenchymal cells, was reported in common in two other independent studies. [81] [82] This highlights one potential difficulty in proteomics biomarker discovery, namely reproducibility even across closely similar techniques.

A potential source of such variability is tissue sample preparation. In particular, a major concern involves cross-contamination between cancer and normal cells, largely due to tissue microheterogeneity. An important technology increasingly used to address this concern is laser-capture microdissection, a high-precision method to delineate and separate tumor cells from normal cells, as demonstrated by Banks and co-workers[90] in their 2DE analysis of RCC. Poznanovic and co-workers[91] used a related technique, laser microdissection and pressure catapulting, which when combined with a highly sensitive analytical technique, radio-iodide labeling DIGE, enabled them to achieve a high-confidence PMF identification of approximately 30 differentially expressed proteins in RCC from microgram quantities of starting tissue samples. In addition to laser-capture microdissection, Sarto and co-workers[92] employed magnetic microbeads conjugated with epithelial-cell specific antibodies to improve sample homogeneity and consistency in the enriched samples.

Diagnostic Expression Profiling in Human Urine

Perhaps owing to the inherent difficulty associated with 2DE urinary proteomics, the application of 2DE to discover disease markers in human urine has been pursued to a lesser degree. Nevertheless, Lafitte and co-workers[93] have shown that it is possible to distinguish between 2DE patterns derived from four representative kidney diseases and from normal samples, even when taking into account the inter- and intraindividual variations observed in the normal controls. Recently, Park and co-workers[94] employed 2DE for the characterization of the urinary proteome in IgA nephropathy (IgAN) and showed that a number of interesting proteins, including transcription factors and regulatory proteins, were expressed at lower abundance in the urine of IgAN patients. Although these proteins await validation and elucidation as to their putative mechanistic roles in disease development or progression, the reported absence of any serum proteins other than the IgA heavy-chain spot train with significantly altered expression in IgAN,[95] indicates a possible localized source for the reported altered urinary protein expression pattern.

Using a direct analysis of urinary proteins purified by strong cation-exchange chromatography and trypsin digestion before reverse-phase LC-MS/MS, Cutillas and co-workers [96] [97] have identified specifically expressed urinary proteins in Dent disease, a renal Fanconi syndrome characterized by a loss of tubular reabsorption function. These identified proteins in the disease sample, mostly chemokines and cytokines, were not detected in normal controls, suggesting their re-uptake in normal renal tubules. By using the isotope-coded affinity tag, the same group was able to run comparison samples (in disease and normal) simultaneously in MS/MS analysis, and therefore to assess semiquantitatively the relative protein expression in each.[98] In addition to the enriched bioactive proteins identified earlier, they found a lower abundance of vitamin and prosthetic group carriers with Fanconi syndrome, pointing to the multiple normal physiologic roles of the proximal tubule in reabsorptive transport functions. Pang and co-workers[99] applied an array of techniques (2DE, 1D LC-MS/MS, and 2D LC-MS/MS) to search for inflammatory signatures in urine; they verified a number of proteins, in particular orosomucoid, which was previously implicated in inflammation.

A recently reported approach that demonstrates increasing applicability to urinary proteomic analysis is capillary electrophoresis with mass spectrometry (CE-MS), pioneered by Weissinger and co-workers. [100] [101] [102] In this technique, migration time of undigested polypeptides obtained by CE is plotted against mass, and using bioinformatics, a two-dimensional pattern not unlike that obtained from 2DE is extracted and analyzed for diagnostic features. A significant advantage of this method over 2DE, aside from speed of analysis, is its ability to probe the low-molecular weight range, commonly impossible to detect in 2DE. Because of the earlier demonstration of its capability to distinguish patterns between urine in representative kidney diseases (e.g., minimal change disease, membranous nephropathy, focal segmental glomerulosclerosis, and hemodialysates) and normal controls, [100] [101] [102] the technique has been applied to mine potential biomarkers of IgA nephropathy,[103] renal damage resulting from type II diabetes,[104] and early diabetic nephropathy arising in type I diabetic adolescents.[105] The response of polypeptide patterns in diabetic nephropathy as a result of candesartan treatment (an angiotensin II receptor blocker) was also characterized using this method.[106] Although mainly dependent on pattern-based discrimination of disease from the normal urinary proteome, a recent study has demonstrated the practicability of coupling this method with top-down MS/MS for identification of intact polypeptides,[107] in this case leading to the identification of three albumin fragments (with sizes up to about 9 kDa) as candidate urinary biomarkers of renal disease.

Biomarker Discovery by Surface-Enhanced Laser Desorption/Ionization-Mass Spectrometry

Surface-enhanced laser desorption/ionization-mass spectrometry, or SELDI-MS, is a technique developed largely as an extension of the basic MALDI technology (see Fig. 70-3B).[108] In SELDI, the solid surface is functionalized with chromatographic (e.g., hydrophobic, cation and anion exchange, or metal affinity) or biologic substrates to enhance the capture of proteins from the applied samples. The strength of this method lies in the rapid and high-throughput analysis of clinical samples: mass spectra can be read out from plasma or urine samples applied directly onto SELDI chips without the need for extensive sample preprocessing. SELDI-MS is a platform that offers potential portability to the clinical setting and could thereby realize the ideal bench-to-bedside promise, which explains the enormous popularity it has gained in clinical proteomics applications.

This technique has been applied to mining for serum markers of RCC, [109] [110] as well as to discriminating between cytologic specimens of clinically related but distinct tumor cells, using only crude cell lysates.[111] More importantly, it has found increasing application in the identification of urinary biomarkers for renal allograft rejection, as the following studies demonstrate. Clarke and co-workers[112] correctly classified 91% of 34 renal transplant patients, including 17 with acute rejection, by using the classification and regression tree algorithm to discriminate between patterns derived from transplant patients with acute rejection and those from patients with stable allograft function, all of whom had undergone allograft biopsy by protocol for histologic diagnosis. Working with a slightly larger sample set consisting of 18 with acute rejection and 22 with stable allograft function, along with 32 healthy normals and 15 potentially confounding disease controls, Schaub and co-workers[113] were again able to discriminate between patterns shared among controls and stable allografts, and that of acute rejection. O'Riordan and co-workers[114] more recently studied a patient population of comparable size and achieved a greater than 90% accurate classification into acute rejection or stable allograft function. It is notable that although the majority of peaks reported by all three studies as having the highest discriminatory values occur in the 5- to 7-kDa range, none in fact were common to all three, at least not within approximately 70 Da of each other. Khurana and co-workers[58] used SELDI to profile a cross section of pediatric patients with steroid-resistant and steroid-sensitive nephrotic syndrome, as well as control samples from children with orthostatic proteinuria. A genetic algorithm was used to search for the smallest number of urinary proteins that yielded the maximum separation of steroid-resistant and steroid-sensitive samples in principal component space. They identified beta-2 microglobulin, a known marker of renal tubular injury, as a correlate of steroid resistance.
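Checking whether discriminatory peaks recur across studies reduces to matching masses within a tolerance. The sketch below keeps only those peaks from the first list that have a counterpart within the tolerance in every other list. All peak masses are hypothetical and are not taken from the cited studies; only the 70-Da window follows the text.

```python
def shared_peaks(peak_lists, tolerance):
    """Peaks from the first list that are matched, within the tolerance (Da),
    by at least one peak in each of the remaining lists."""
    first, *others = peak_lists
    return [
        mass
        for mass in first
        if all(any(abs(mass - other) <= tolerance for other in peaks) for peaks in others)
    ]

# Hypothetical top discriminatory peaks (Da) from three independent studies.
study_1 = [5270.0, 5550.0, 6740.0]
study_2 = [5872.0, 6990.0, 10700.0]
study_3 = [5400.0, 6210.0, 6450.0]

strict = shared_peaks([study_1, study_2, study_3], tolerance=70.0)
loose = shared_peaks([study_1, study_2, study_3], tolerance=350.0)
print(strict)  # no peak recurs within 70 Da
print(loose)   # a much wider window is needed before any peaks line up
```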

Conflicting results observed among specific disease-focused, SELDI-based studies have often been a cause for concern over the lack of robustness of the SELDI method. Rogers and co-workers[115] have demonstrated how a neural network trained to discriminate between RCC and non-RCC urinary SELDI patterns showed greatly reduced performance when tested later on a new sample set. Variability in chip quality across manufacturing batches, in laser performance, and, more generally, in instrument precision is a contributing factor, especially because Schaub and co-workers[116] reported that spectral reproducibility is practically unaffected by long-term storage at −70°C or by freeze-thaw cycles of urine samples. Traum and co-workers[117] showed that the time interval between sample collection and deep freezing does not significantly affect SELDI variability in split urine and plasma samples, and that the observed variability is most likely related to the SELDI platform. On the other hand, none of the classification models derived in the above-mentioned studies on allograft rejection has been tested on an independent validation set, so most of their reported accuracy values describe fit to the training data rather than true predictive performance. A recent study to predict acute kidney injury from urinary SELDI maps reported 100% sensitivity. However, the study was performed on the same sample set (60 patients undergoing cardiopulmonary bypass, of whom 15% went on to experience acute kidney injury) that was analyzed to derive the three or four peaks used for discrimination. One of these peaks belongs to a 66-kDa protein,[118] and the possibility that this protein is albumin has not been definitively excluded.
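Why accuracy measured on the discovery samples is not predictive can be shown with a deliberately uninformative example. In the sketch below, the "spectra" are pure random noise with arbitrary class labels, and the classifier is a one-nearest-neighbor rule chosen for simplicity (it is not the method of any cited study). Scored on its own training samples it is perfect by construction; scored on fresh samples it has nothing real to go on.

```python
import math
import random

def nearest_label(sample, train_x, train_y):
    """Label of the training spectrum closest to the sample (1-nearest neighbor)."""
    closest = min(range(len(train_x)), key=lambda i: math.dist(sample, train_x[i]))
    return train_y[closest]

def accuracy(test_x, test_y, train_x, train_y):
    """Fraction of test samples assigned their recorded label."""
    correct = sum(
        nearest_label(x, train_x, train_y) == y for x, y in zip(test_x, test_y)
    )
    return correct / len(test_x)

rng = random.Random(0)  # fixed seed so the run is repeatable

def noise_spectra(n_samples, n_peaks=50):
    """Hypothetical peak intensities that carry no class information at all."""
    return [[rng.random() for _ in range(n_peaks)] for _ in range(n_samples)]

train_x, train_y = noise_spectra(40), [i % 2 for i in range(40)]
test_x, test_y = noise_spectra(40), [i % 2 for i in range(40)]

# Each training spectrum is its own nearest neighbor, so this is always perfect.
resubstitution = accuracy(train_x, train_y, train_x, train_y)
# On independent samples the same model can only guess.
held_out = accuracy(test_x, test_y, train_x, train_y)
print(resubstitution, held_out)
```

The held-out figure is expected to hover near chance, which is the behavior an independent validation set is designed to expose.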

Proteomics in Animal Model Studies

The application of proteomics to animal models of renal disease has yielded promising results, both in a better understanding of the molecular mechanisms of the disease studied and in the discovery of candidate biomarkers with potential human applications. A representative example is Thongboonkerd and co-workers'[119] 2DE study of a mouse model of diabetic nephropathy, showing an increase in monocyte/neutrophil elastase inhibitor and a decrease in elastase IIIB, which they used to correctly predict an increased level of elastin in diabetic kidneys in three human renal biopsies from type I diabetes patients. In another 2DE study of renal diabetic injury, Rosca and co-workers[120] used antibodies against methylglyoxal-modified proteins and identified by MS/MS targets of glycation in the streptozotocin-induced diabetic rat kidney model. These investigators were able to link the majority of identified target proteins belonging to oxidative phosphorylation and fatty-acid oxidation pathways (e.g., components of complex III), to physiologic characteristics of the diabetic renal mitochondria such as higher oxidative stress and lower energetics.

The limited accessibility to human clinical samples has made the use of animal models in biomarker discovery particularly attractive, especially because large sample numbers become critical to the biomarkers' statistical validation. Harnessing this advantage, Voshol and co-workers[121] studied acute renal allograft rejection in a Brown-Norway to Lewis rat kidney allotransplantation model, using SELDI and 2DE analysis of both serum and urine. Because of the predictability of the disease progression in this animal model (5 days before advanced tissue damage), the study was able to evaluate the sensitivity of candidate biomarkers at different stages of progression among allograft rejecters. Using an autosomal recessive polycystic kidney disease mouse model derived from a double mutation in the Nek8 gene, Valkova and co-workers[122] have found overexpression of galectin–1, sorcin, vimentin, and major urinary proteins as possibly contributing to cystogenesis through a mechano-sensing and osmo-sensing dysfunction in PKD tubular epithelial cells. Through a two-kidney, one-clip model of renovascular hypertension in the Lewis rat, Pinet and co-workers[123] have shown that loss of troponin-T is a marker of differentiation of smooth muscle cells to myoepitheloid cells as a result of stenosis and high renin levels in the hypertensive kidney.

Other proteomics studies using animal models to elucidate disease and biologic processes in the kidney include demonstrating a role for compensatory mechanisms when various major reabsorption pathways for Na+ and water transport in the proximal tubule have been ablated[124]; a relative abundance of ion transporters in the premacula densa of the obese Zucker rat model of type II diabetes[125]; the effects of aging in nonmitochondrial fractions[126]; and the effects of dietary phosphorus restriction in proximal tubules.[127]

Renal toxicity studies are another area that has benefited from the use of animal models. The use of proteomics promises a better mechanistic understanding and better biomarkers of toxicity arising from drugs, chemicals, and other environmental insults to the kidney. A few such toxicants that have been studied by proteomics are 4-aminophenol, D-serine,[128] puromycin aminonucleoside,[129] nitric oxide (DETA-NO),[130] JP-8 jet fuel, [131] [132] lead, [133] [134] aldosterone, [135] [136] Ioxilan (radiocontrast agent),[137] and gentamicin aminoglycoside.[138] This rapidly advancing area under the purview of toxicogenomics has been covered extensively in a recent review.[139]


Completing the Proteome Map of the Human Kidney and Urine

We have previously identified several critical developments that must take place to fully realize the more focused goal of proteomics as a clinical diagnostic tool.[79] These same developmental milestones apply in general to the proteomics of the renal system, whether in health or disease. We have earlier discussed the task of completing a reference renal proteomics map and the difficulty inherent in such a task given the cellular diversity and regional heterogeneity represented in the kidney. Considering the size of the kidney transcriptome, with genes numbering in the tens of thousands,[140] the successful identification of several hundred proteins in the kidney proteome to date may be considered a modest achievement indeed. The continuing effort to map the proteome of individual functional units (e.g., glomerulus and the kidney cortex) in the kidney will have to be extended using the best purification techniques to ensure homogeneous samples, in order to reduce the complexity of the total kidney proteome (divide and conquer) while also improving our insight into these units' specialized functions and differentiation processes.

Another impetus for these proteome-cataloging studies of the kidney derives from a tangential source: the increasing interest in urine as a diagnostic fluid for many diseases including even those not directly kidney associated. It is often asked which and what proportion of the proteins identified in urine are derived directly from the kidneys, and which are contributed mainly by plasma filtration. Knowing the proteins' source within the kidney would be a valuable guide in the search for candidate urinary markers of renal diseases, in cases in which the disease localization is well characterized (see for instance Table 70-1 on criteria for a good biomarker). The inverse is also true in that results from proteomics studies in urine could guide a more focused proteomics investigation of renal function. An example of this is Pisitkun and co-workers'[135] exosome profiling in urine, which they followed up with a large-scale MS/MS identification of proteins associated with aquaporin-2 vesicular bodies in the inner medullary collecting duct, thus elucidating the potential mechanism of transport of these vesicles to the apical membrane from where they are subsequently released to the urine.[141]

Mapping the entire urinary proteome itself continues to be an enormous challenge. This objective will surely benefit from a large-scale, multilaboratory, multiplatform collaboration such as that formed in the context of the human plasma proteome project.[142] As discussed earlier, no single platform promises to sufficiently cover the urine's dynamic range of protein abundance. Affinity capture by ligand-libraries, used with impressive results in the characterization of plasma samples, has been less than stellar when applied to urine.[76] This is probably due to the proteolytic processes occurring in the renal tubules, which tend to increase the diversity of polypeptide fragments in the urine relative to plasma, especially in the low-molecular mass range.[1] The most recent tally of the total number of proteins identified in human urine stands at approximately 800.[76] A rigorous analysis of the results of the Human Proteome Organization (HUPO) plasma proteome project places the estimate of unique gene products present in the human plasma between approximately 900 and 3000.[143] The last figures provide a rough gauge as to the extent of the “hidden” urinary proteome that awaits unveiling.

Advancing Proteomics to the Renal Diagnostics Arena

The vision of translating our knowledge of the human genome and proteome to real health benefits depends in part on our ability to bring current technology into the arena of clinically relevant diagnostic tools. In this context, questions of robustness and biomarker validation become the central issues. Three major approaches have emerged as proteomics studies have begun to address the goal of translational medicine. In one, proteomics is delegated the task of a high-throughput screen from which candidate biomarkers are vetted for further validation using immunoassay methods such as enzyme-linked immunoassay (see for instance, the proposed algorithm of Skates and Iliopoulos[144]). We have seen this approach applied in a number of 2DE studies on renal carcinoma that were followed up by more thorough validation of best-candidate markers at both protein and transcript levels using well-established methods. [83] [85] This approach is predicted to more rapidly usher candidate biomarkers into clinical-phase trials, because it does not fundamentally depart from standard practice in the conduct of clinical trials of investigational drugs and diagnostic tests. This approach, however, poses a potentially high risk of failure when applied to large sample populations, because the one-marker paradigm is based on the assumption of a single disease phenotype at the molecular level or at least a common pathway.

A way of mitigating this risk is to consider multiple best-candidate markers simultaneously, and to “cluster” them as a multicomponent biomarker rather than discretely. This intermediate approach of having an array of markers promises to capture a much wider spectrum of molecular phenotypes as might be expected in complex diseases. What is then needed is an additional proteomics validation step that is more focused on a selected group of biomarkers, with the purpose of defining the minimum number of biomarkers required (i.e., the number of components) to meet sensitivity and specificity criteria. This can be done on a more quantitative proteomics platform with medium-throughput capability. One such platform is flow cytometry combined with immunostaining applied, for example, to known cancer antigens in various renal cell lines.[145] Another such platform is immunoblotting 2DE, demonstrated in expression analyses of major histocompatibility complex class I antigen processing and presentation pathway proteins[146] and heat shock proteins[147] in RCC cell lines treated with interferon-γ. The combination of mass spectrometry (MALDI-TOF) and antibody-based affinity capture may also be useful where protein variants arising from proteolysis or PTMs constitute the markers of interest. [148] [149] Indeed, rapid developments in MS, especially in mass accuracy and range, would enable the routine capture of such complex information as contained in PTMs, as has recently been demonstrated in a single amino-acid resolution MS/MS mapping of IgA O-glycan structure relevant to IgA nephropathy.[150]
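
The idea of defining the minimum number of panel components that meets sensitivity and specificity criteria can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the studies cited above: the marker names, the per-sample positive/negative calls, the target thresholds, and the combination rule (a panel is called positive when any one of its markers is positive). The toy cohort is built so that each of two markers detects a different molecular subtype of the disease, which is the situation in which a single-marker paradigm would be expected to fail.

```python
from itertools import combinations


def sens_spec(calls, labels):
    """Return (sensitivity, specificity) of boolean calls against boolean disease labels."""
    true_pos = sum(1 for c, y in zip(calls, labels) if c and y)
    true_neg = sum(1 for c, y in zip(calls, labels) if not c and not y)
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    return true_pos / n_pos, true_neg / n_neg


def smallest_panel(markers, labels, min_sens, min_spec):
    """Exhaustively search panels of increasing size.

    markers maps a marker name to its per-sample boolean calls.
    Assumed rule: the panel is positive if ANY member marker is positive.
    Returns (panel, sensitivity, specificity), or None if no panel qualifies.
    """
    names = sorted(markers)
    for size in range(1, len(names) + 1):
        for panel in combinations(names, size):
            calls = [any(markers[m][i] for m in panel) for i in range(len(labels))]
            sens, spec = sens_spec(calls, labels)
            if sens >= min_sens and spec >= min_spec:
                return panel, sens, spec
    return None


# Hypothetical cohort: 6 patients (two molecular subtypes of 3 each), then 6 controls.
T, F = True, False
labels = [T, T, T, T, T, T, F, F, F, F, F, F]
markers = {
    "marker_A": [T, T, T, F, F, F, F, F, F, F, F, F],  # detects subtype 1 only
    "marker_B": [F, F, F, T, T, T, F, F, F, F, F, T],  # detects subtype 2, one false positive
    "marker_C": [T, T, F, F, T, F, T, T, F, F, F, F],  # nonspecific marker
}

result = smallest_panel(markers, labels, min_sens=0.9, min_spec=0.8)
panel, sens, spec = result
print(panel, round(sens, 3), round(spec, 3))
```

In this toy cohort no single marker reaches the sensitivity target, and the search stops at the first two-marker panel that does. An exhaustive search is only practical for the short candidate lists that survive an initial screen, and any panel chosen this way would still need confirmation in an independent sample set.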

The third approach, representing the other end of the spectrum, is the use of patterns derived from the application of proteomics techniques (MS in particular) as biomarkers.[151] Examples of this abound especially in the use of SELDI-MS, although another technique, CE-MS, has recently shown enormous potential especially in urinary proteomics expression profiling. The pattern approach relies on bioinformatics methods such as statistical and machine-learning algorithms to mine features in the raw data, such as peaks, spots, or clusters of these, containing the diagnostic information of interest. A number of issues have been raised concerning this approach, chief among them the issue of robustness,[77] largely as the result of lack of rigorous validation, especially with an independent validation set of a sufficient size relevant to the study population. The inherent difficulty with the pattern approach is the large number of variables admitted by the analysis, the majority of which may be irrelevant to the task of classification at hand, thus increasing the likelihood that spurious features arising from the pattern may bias the conclusions. The instrument's low mass accuracy and precision, as well as the poor depth of coverage, have the potential to compound this likelihood of bias. The latter is due to the large variability in what may be sampled from the proteome. This has been a serious limiting factor for the SELDI platform with its relatively less efficient protein capture as compared, for example, with bead-based affinity capture. Nevertheless, with its asset of direct portability to the clinical setting (relieved of the cumbersome requirement of having to identify each diagnostic feature) ensuring it a place in diagnostic research, the pattern approach will likely continue to mature with improvements in MS instruments and data analysis, and through application in larger population study cohorts in order to produce more statistically rigorous studies.
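
The risk posed by a large number of irrelevant variables can be made concrete with a small simulation. This is a sketch under stated assumptions rather than a model of any real instrument or of the methods used in the cited studies: every "peak" is pure Gaussian noise, the cohort sizes and feature count are arbitrary, and the classifier is a deliberately simple single-feature threshold rule. All function names are hypothetical.

```python
import random


def make_noise_cohort(rng, n_samples, n_features):
    """First half cases, second half controls; every feature is pure noise."""
    labels = [i < n_samples // 2 for i in range(n_samples)]
    data = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    return data, labels


def fit_single_feature_rule(data, labels, j):
    """Place a threshold on feature j midway between the two class means."""
    case = [row[j] for row, y in zip(data, labels) if y]
    ctrl = [row[j] for row, y in zip(data, labels) if not y]
    mean_case = sum(case) / len(case)
    mean_ctrl = sum(ctrl) / len(ctrl)
    return (mean_case + mean_ctrl) / 2.0, mean_case > mean_ctrl


def accuracy(data, labels, j, threshold, cases_high):
    """Fraction of samples that the threshold rule on feature j classifies correctly."""
    correct = 0
    for row, y in zip(data, labels):
        predicted_case = (row[j] > threshold) == cases_high
        correct += predicted_case == y
    return correct / len(labels)


def select_best_feature(data, labels):
    """Pick the feature whose rule looks best on the data it was fitted to."""
    best = None
    for j in range(len(data[0])):
        threshold, cases_high = fit_single_feature_rule(data, labels, j)
        acc = accuracy(data, labels, j, threshold, cases_high)
        if best is None or acc > best[0]:
            best = (acc, j, threshold, cases_high)
    return best


rng = random.Random(0)  # fixed seed so that the run is repeatable
train_x, train_y = make_noise_cohort(rng, n_samples=20, n_features=1000)
valid_x, valid_y = make_noise_cohort(rng, n_samples=200, n_features=1000)

train_acc, j, threshold, cases_high = select_best_feature(train_x, train_y)
valid_acc = accuracy(valid_x, valid_y, j, threshold, cases_high)
print("discovery-set accuracy:", train_acc)
print("independent validation accuracy:", valid_acc)
```

Because no feature carries any real signal, whatever accuracy the selected feature shows in the small discovery set reflects selection among many candidates, and it would be expected to fall toward chance in the independent set. This is the reason an independent validation set of sufficient size matters for pattern-based markers.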

From Parts Catalog and Biomarkers to Systems Biology

In order for our understanding of disease mechanism to achieve a revolutionary leap, proteomics needs to transcend the basic tasks of completing the catalog or map of expressed proteins, as well as the routine profiling of their expression in disease and other biologic processes in the search for clinically relevant biomarkers (Fig. 70-5). What is needed is to build, from our knowledge of the parts list, the networks and pathways that govern biologic processes, in what is termed systems biology.[152] This direction is beginning to emerge in representative studies that apply proteomics to characterize changes in a subset or subsets of the proteome in order to relate known protein interaction networks to mechanisms of health, disease, drug response, and toxicity. For example, the antiapoptotic role of HSP27 in response to cysteine-conjugate nephrotoxic injury and the identification of modulators in this antiapoptotic pathway was recently studied by DIGE in a model cell line LLC-PK1.[153] In another example, the stress-protection and survival mechanism involved in glucocorticoid treatment was elucidated by a 2DE proteomics profiling of cultured murine podocytes.[154] The homogeneity, well-characterized phenotype, and amenability to controlled perturbations of cell cultures and cell lines make them ideal model systems for such studies. The significance of the clinical inference derived from such studies is made more promising by recent findings that primary normal kidney epithelial and RCC cell cultures retain the proteomic profiles of their source tissues.[155] Thus, in model-cell proteomics, we envision one more tool to dissect in even finer and more exquisite detail the cellular and subcellular processes that define renal health and disease.

FIGURE 70-5  Emerging development paths or approaches in the clinical application of renal disease biomarkers. Throughput refers to the number of markers used in the analysis of samples, as represented by the technologies listed on the right side. The paths may be classified as (1) high-throughput screening with best-candidate marker validation, (2) multimarker validation, and (3) a purely pattern-based approach. CE-MS, capillary electrophoresis-mass spectrometry. DIGE, differential imaging gel electrophoresis. PROTEOMEX, combined proteomics with serologic expression cloning.



1. Lander ES, Linton LM, Birren B, et al: Initial sequencing and analysis of the human genome.  Nature  2001; 409:860-921.

2. Dzau VJ, Austin MJ, Brown P, et al: Revolution and renaissance.  Physiol Genomics  1999; 1:1-2.

3. Soutourina O, Cheval L, Doucet A: Global analysis of gene expression in mammalian kidney.  Pflugers Arch  2005; 450:13-25.

4. Kaimori JY, Takenaka M, Okubo K: Quantification of gene expression in mouse and human renal proximal tubules.  Methods Mol Biol  2005; 293:209-219.

5. Dressler GR: The cellular basis of kidney development.  Annu Rev Cell Dev Biol  2006; 22:509-529.

6. Boheler KR, Tarasov KV: SAGE analysis to identify embryonic stem cell-predominant transcripts.  Methods Mol Biol  2006; 329:195-221.

7. Brooks HL, Ageloff S, Kwon TH, et al: cDNA array identification of genes regulated in rat renal medulla in response to vasopressin infusion.  Am J Physiol Renal Physiol  2003; 284:F218-F228.

8. Gumz ML, Popp MP, Wingo CS, Cain BD: Early transcriptional effects of aldosterone in a mouse inner medullary collecting duct cell line.  Am J Physiol Renal Physiol  2003; 285:F664-F673.

9. Gomez-Sanchez EP: Brain mineralocorticoid receptors: orchestrators of hypertension and end-organ disease.  Curr Opin Nephrol Hypertens  2004; 13:191-196.

10. Firsov D: Revisiting sodium and water reabsorption with functional genomics tools.  Curr Opin Nephrol Hypertens  2004; 13:59-65.

11. Sarwal M, Chua MS, Kambham N, et al: Molecular heterogeneity in acute renal allograft rejection identified by DNA microarray profiling.  N Engl J Med  2003; 349:125-138.

12. Chua MS, Mansfield E, Sarwal M: Applications of microarrays to renal transplantation: Progress and possibilities.  Front Biosci  2003; 8:s913-s923.

13. Hotchkiss H, Chu TT, Hancock WW, et al: Differential expression of profibrotic and growth factors in chronic allograft nephropathy.  Transplantation  2006; 81:342-349.

14. Ding R, Medeiros M, Dadhania D, et al: Noninvasive diagnosis of BK virus nephritis by measurement of messenger RNA for BK virus VP1 in urine.  Transplantation  2002; 74:987-994.

15. Li B, Hartono C, Ding R, et al: Noninvasive diagnosis of renal-allograft rejection by measurement of messenger RNA for perforin and granzyme B in urine.  N Engl J Med  2001; 344:947-954.

16. Medeiros M, Sharma VK, Ding R, et al: Optimization of RNA yield, purity and mRNA copy number by treatment of urine cell pellets with RNAlater.  J Immunol Methods  2003; 279:135-142.

17. Muthukumar T, Dadhania D, Ding R, et al: Messenger RNA for FOXP3 in the urine of renal-allograft recipients.  N Engl J Med  2005; 353:2342-2351.

18. Tatapudi RR, Muthukumar T, Dadhania D, et al: Noninvasive detection of renal allograft inflammation by measurements of mRNA for IP-10 and CXCR3 in urine.  Kidney Int  2004; 65:2390-2397.

19. Zhang Q, Reed EF: Array-based methods for diagnosis and prevention of transplant rejection.  Expert Rev Mol Diagn  2006; 6:165-178.

20. Fielden MR, Eynon BP, Natsoulis G, et al: A gene expression signature that predicts the future onset of drug-induced renal tubular toxicity.  Toxicol Pathol  2005; 33:675-683.

21. Merrick BA, Bruno ME: Genomic and proteomic profiling for biomarkers and signature profiles of toxicity.  Curr Opin Mol Ther  2004; 6:600-607.

22. Hubank M, Schatz DG: Identifying differences in mRNA expression by representational difference analysis of cDNA.  Nucl Acid Res  1994; 22:5640-5648.

23. Ichimura T, Bonventre JV, Bailly V, et al: Kidney injury molecule-1 (KIM-1), a putative epithelial cell adhesion molecule containing a novel immunoglobulin domain, is up-regulated in renal cells after injury.  J Biol Chem  1998; 273:4135-4142.

24. Ichimura T, Hung CC, Yang SA, et al: Kidney injury molecule-1: a tissue and urinary biomarker for nephrotoxicant-induced renal injury.  Am J Physiol Renal Physiol  2004; 286:F552-F563.

25. Bailly V, Zhang Z, Meier W, et al: Shedding of kidney injury molecule-1, a putative adhesion protein involved in renal regeneration.  J Biol Chem  2002; 277:39739-39748.

26. Han WK, Bailly V, Abichandani R, et al: Kidney Injury Molecule-1 (KIM-1): A novel biomarker for human renal proximal tubule injury.  Kidney Int  2002; 62:237-244.

27. Vaidya VS, Bonventre JV: Mechanistic biomarkers for cytotoxic acute kidney injury.  Expert Opin Drug Metab Toxicol  2006; 2:697-713.

28. Amin RP, Vickers AE, Sistare F, et al: Identification of putative gene based markers of renal toxicity.  Environ Health Perspect  2004; 112:465-479.

29. Hong ME, Hong JC, Stepkowski S, Kahan BD: Correlation between cyclosporine-induced nephrotoxicity in reduced nephron mass and expression of kidney injury molecule-1 and aquaporin-2 gene.  Transplant Proc  2005; 37:4254-4258.

30. Perez-Rojas J, Blanco JA, Cruz C, et al: Mineralocorticoid receptor blockade confers renoprotection in preexisting chronic cyclosporine nephrotoxicity.  Am J Physiol Renal Physiol  2007; 292:F131-F139.

31. van Timmeren MM, Bakker SJ, Vaidya VS, et al: Tubular kidney injury molecule-1 in protein-overload nephropathy.  Am J Physiol Renal Physiol  2006; 291:F456-F464.

32. De Borst MH, van Timmeren MM, Vaidya VS, et al: Induction of kidney injury molecule-1 (Kim-1) in homozygous Ren2 rats is attenuated by blockade of the renin-angiotensin system or p38 MAP kinase.  Am J Physiol Renal Physiol  2007; 292:F313-F320.

33. Tsuji M, Monkawa T, Yoshino J, et al: Microarray analysis of a reversible model and an irreversible model of anti-Thy-1 nephritis.  Kidney Int  2006; 69:996-1004.

34. Schuurs TA, Gerbens F, van der Hoeven JA, et al: Distinct transcriptional changes in donor kidneys upon brain death induction in rats: insights in the processes of brain death.  Am J Transplant  2004; 4:1972-1981.

35. Kuehn EW, Park KM, Somlo S, Bonventre JV: Kidney injury molecule-1 expression in murine polycystic kidney disease.  Am J Physiol Renal Physiol  2002; 283:F1326-F1336.

36. Thukral SK, Nordone PJ, Hu R, et al: Prediction of nephrotoxicant action and identification of candidate toxicity-related biomarkers.  Toxicol Pathol  2005; 33:343-355.

37. Basile DP, Fredrich K, Alausa M, et al: Identification of persistently altered gene expression in the kidney after functional recovery from ischemic acute renal failure.  Am J Physiol Renal Physiol  2005; 288:F953-F963.

38. Devarajan P, Mishra J, Supavekin S, et al: Gene expression in early ischemic renal injury: clues towards pathogenesis, biomarker discovery, and novel therapeutics.  Mol Genet Metab  2003; 80:365-376.

39. Kieran NE, Doran PP, Connolly SB, et al: Modification of the transcriptomic response to renal ischemia/reperfusion injury by lipoxin analog.  Kidney Int  2003; 64:480-492.

40. Safirstein RL: Acute renal failure: From renal physiology to the renal transcriptome.  Kidney Int  2004;S62-S66.

41. Supavekin S, Zhang W, Kucherlapati R, et al: Differential gene expression following early renal ischemia/reperfusion.  Kidney Int  2003; 63:1714-1724.

42. Yoshida T, Kurella M, Beato F, et al: Monitoring changes in gene expression in renal ischemia-reperfusion in the rat.  Kidney Int  2002; 61:1646-1654.

43. Yuen PS, Jo SK, Holly MK, et al: Ischemic and nephrotoxic acute renal failure are distinguished by their broad transcriptomic responses.  Physiol Genomics  2006; 25:375-386.

44. Kohane IS, Kho AT, Butte A: Microarrays for an integrative genomics,  Cambridge, MA, MIT Press, 2002.

45. Hwang KB, Kong SW, Greenberg SA, Park PJ: Combining gene expression data from different generations of oligonucleotide arrays.  BMC Bioinformatics  2004; 5:159.

46. Bammler T, Beyer RP, Bhattacharya S, et al: Standardizing global gene expression analysis between laboratories and across platforms.  Nat Methods  2005; 2:351-356.

47. Irizarry RA, Warren D, Spencer F, et al: Multiple-laboratory comparison of microarray platforms.  Nat Methods  2005; 2:345-350.

48. Shi L, Reid LH, Jones WD, et al: The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements.  Nat Biotechnol  2006; 24:1151-1161.

49. Butte AJ, Kohane IS: Creation and implications of a phenome-genome network.  Nat Biotechnol  2006; 24:55-62.

50. Kim RD, Park PJ: Improving identification of differentially expressed genes in microarray studies using information from public databases.  Genome Biol  2004; 5:R70.

51. Bicciato S: Artificial neural network technologies to identify biomarkers for therapeutic intervention.  Curr Opin Mol Ther  2004; 6:616-623.

52. Byvatov E, Schneider G: Support vector machine applications in bioinformatics.  Appl Bioinformatics  2003; 2:67-77.

53. Paul TK, Iba H: Gene selection for classification of cancers using probabilistic model building genetic algorithm.  Biosystems  2005; 82:208-225.

54. Kho AT, Zhao Q, Cai Z, et al: Conserved mechanisms across development and tumorigenesis revealed by a mouse development perspective of human cancers.  Genes Dev  2004; 18:629-640.

55. Kuruvilla FG, Park PJ, Schreiber SL: Vector algebra in the analysis of genome-wide expression data.  Genome Biol  2002; 3:RESEARCH0011.

56. Ramoni MF, Sebastiani P, Kohane IS: Cluster analysis of gene expression dynamics.  Proc Natl Acad Sci U S A  2002; 99:9121-9126.

57. Butte AJ, Kohane IS: Mutual information relevance networks: Functional genomic clustering using pairwise entropy measurements.  Pac Symp Biocomput  2000;418-429.

58. Khurana M, Traum AZ, Aivado M, et al: Urine proteomic profiling of pediatric nephrotic syndrome.  Pediatr Nephrol  2006; 21:1257-1265.

59. Lin TH, Chiu SH, Tsai KC: Supervised feature ranking using a genetic algorithm optimized artificial neural network.  J Chem Inf Model  2006; 46:1604-1614.

60. Li L, Jiang W, Li X, et al: A robust hybrid between genetic algorithm and support vector machine for extracting an optimal feature gene subset.  Genomics  2005; 85:16-23.

61. R Development Core Team: R: A language and environment for statistical computing. Vienna, Austria, R Foundation for Statistical Computing, 2006.

62. Magni F, Sarto C, Valsecchi C, et al: Expanding the proteome two-dimensional gel electrophoresis reference map of human renal cortex by peptide mass fingerprinting.  Proteomics  2005; 5:816-825.

63. Sarto C, Marocchi A, Sanchez JC, et al: Renal cell carcinoma and normal kidney protein expression.  Electrophoresis  1997; 18:599-604.

64. Yoshida Y, Miyazaki K, Kamiie J, et al: Two-dimensional electrophoretic profiling of normal human kidney glomerulus proteome and construction of an extensible markup language (XML)-based database.  Proteomics  2005; 5:1083-1096.

65. Witzmann FA, Fultz CD, Grant RA, et al: Differential expression of cytosolic proteins in the rat kidney cortex and medulla: Preliminary proteomics.  Electrophoresis  1998; 19:2491-2497.

66. Arthur JM, Thongboonkerd V, Scherzer JA, et al: Differential expression of proteins in renal cortex and medulla: A proteomic approach.  Kidney Int  2002; 62:1314-1321.

67. Hoffert JD, van Balkom BW, Chou CL, Knepper MA: Application of difference gel electrophoresis to the identification of inner medullary collecting duct proteins.  Am J Physiol Renal Physiol  2004; 286:F170-F179.

68. Davis MT, Spahr CS, McGinley MD, et al: Towards defining the urinary proteome using liquid chromatography-tandem mass spectrometry. II. Limitations of complex mixture analyses.  Proteomics  2001; 1:108-117.

69. Spahr CS, Davis MT, McGinley MD, et al: Towards defining the urinary proteome using liquid chromatography-tandem mass spectrometry. I. Profiling an unfractionated tryptic digest.  Proteomics  2001; 1:93-107.

70. Pieper R, Gatlin CL, McGrath AM, et al: Characterization of the human urinary proteome: A method for high-resolution display of urinary proteins on two-dimensional electrophoresis gels with a yield of nearly 1400 distinct protein spots.  Proteomics  2004; 4:1159-1174.

71. Pisitkun T, Shen RF, Knepper MA: Identification and proteomic profiling of exosomes in human urine.  Proc Natl Acad Sci U S A  2004; 101:13368-13373.

72. Oh J, Pyo JH, Jo EH, et al: Establishment of a near-standard two-dimensional human urine proteomic map.  Proteomics  2004; 4:3485-3497.

73. Thongboonkerd V, McLeish KR, Arthur JM, Klein JB: Proteomic analysis of normal human urinary proteins isolated by acetone precipitation or ultracentrifugation.  Kidney Int  2002; 62:1461-1469.

74. Thongboonkerd V, Chutipongtanate S, Kanlaya R: Systematic evaluation of sample preparation methods for gel-based human urinary proteomics: Quantity, quality, and variability.  J Proteome Res  2006; 5:183-191.

75. Smith G, Barratt D, Rowlinson R, et al: Development of a high-throughput method for preparing human urine for two-dimensional electrophoresis.  Proteomics  2005; 5:2315-2318.

76. Castagna A, Cecconi D, Sennels L, et al: Exploring the hidden human urinary proteome via ligand library beads.  J Proteome Res  2005; 4:1917-1930.

77. Baker M: In biomarkers we trust?  Nat Biotechnol  2005; 23:297-304.

78. Thongboonkerd V: Proteomic analysis of renal diseases: Unraveling the pathophysiology and biomarker discovery.  Expert Rev Proteomics  2005; 2:349-366.

79. Vidal BC, Bonventre JV, Hsu SIH: Towards the application of proteomics in renal disease diagnosis.  Clin Sci (Lond)  2005; 109:421-430.

80. Balabanov S, Zimmermann U, Protzel C, et al: Tumour-related enzyme alterations in the clear cell type of human renal cell carcinoma identified by two-dimensional gel electrophoresis.  Eur J Biochem  2001; 268:5977-5980.

81. Hwa JS, Park HJ, Jung JH, et al: Identification of proteins differentially expressed in the conventional renal cell carcinoma by proteomic analysis.  J Korean Med Sci  2005; 20:450-455.

82. Unwin RD, Craven RA, Harnden P, et al: Proteomic changes in renal cancer and co-ordinate demonstration of both the glycolytic and mitochondrial aspects of the Warburg effect.  Proteomics  2003; 3:1620-1632.

83. Dallmann K, Junker H, Balabanov S, et al: Human agmatinase is diminished in the clear cell type of renal cell carcinoma.  Int J Cancer  2004; 108:342-347.

84. Hall YN, Fuentes EF, Chertow GM, Olson JL: Race/ethnicity and disease severity in IgA nephropathy.  BMC Nephrol  2004; 5:10.

85. Hwa JS, Kim HJ, Goo BM, et al: The expression of ketohexokinase is diminished in human clear cell type of renal cell carcinoma.  Proteomics  2006; 6:1077-1084.

86. Sarto C, Valsecchi C, Magni F, et al: Expression of heat shock protein 27 in human renal cell carcinoma.  Proteomics  2004; 4:2252-2260.

87. Tremolada L, Magni F, Valsecchi C, et al: Characterization of heat shock protein 27 phosphorylation sites in renal cell carcinoma.  Proteomics  2005; 5:788-795.

88. Klade CS, Voss T, Krystek E, et al: Identification of tumor antigens in renal cell carcinoma by serological proteome analysis.  Proteomics  2001; 1:890-898.

89. Kellner R, Lichtenfels R, Atkins D, et al: Targeting of tumor associated antigens in renal cell carcinoma using proteome-based analysis and their clinical significance.  Proteomics  2002; 2:1743-1751.

90. Banks RE, Dunn MJ, Forbes MA, et al: The potential use of laser capture microdissection to selectively obtain distinct populations of cells for proteomic analysis-preliminary findings.  Electrophoresis  1999; 20:689-700.

91. Poznanovic S, Wozny W, Schwall GP, et al: Differential radioactive proteomic analysis of microdissected renal cell carcinoma tissue by 54 cm isoelectric focusing in serial immobilized pH gradient gels.  J Proteome Res  2005; 4:2117-2125.

92. Sarto C, Valsecchi C, Mocarelli P: Renal cell carcinoma: Handling and treatment.  Proteomics  2002; 2:1627-1629.

93. Lafitte D, Dussol B, Andersen S, et al: Optimized preparation of urine samples for two-dimensional electrophoresis and initial application to patient samples.  Clin Biochem  2002; 35:581-589.

94. Park MR, Wang EH, Jin DC, et al: Establishment of a 2-D human urinary proteomic map in IgA nephropathy.  Proteomics  2006; 6:1066-1076.

95. Shuib AS, Chua CT, Hashim OH: Sera of IgA nephropathy patients contain a heterogeneous population of relatively cationic alpha-heavy chains.  Nephron  1998; 78:290-295.

96. Cutillas PR, Norden AG, Cramer R, et al: Detection and analysis of urinary peptides by on-line liquid chromatography and mass spectrometry: Application to patients with renal Fanconi syndrome.  Clin Sci (Lond)  2003; 104:483-490.

97. Cutillas PR, Norden AG, Cramer R, et al: Urinary proteomics of renal Fanconi syndrome.  Contrib Nephrol  2004; 141:155-169.

98. Cutillas PR, Chalkley RJ, Hansen KC, et al: The urinary proteome in Fanconi syndrome implies specificity in the reabsorption of proteins by renal proximal tubule cells.  Am J Physiol Renal Physiol  2004; 287:F353-F364.

99. Pang JX, Ginanni N, Dongre AR, et al: Biomarker discovery in urine by proteomics.  J Proteome Res  2002; 1:161-169.

100. Kaiser T, Hermann A, Kielstein JT, et al: Capillary electrophoresis coupled to mass spectrometry to establish polypeptide patterns in dialysis fluids.  J Chromatogr A  2003; 1013:157-171.

101. Weissinger EM, Wittke S, Kaiser T, et al: Proteomic patterns established with capillary electrophoresis and mass spectrometry for diagnostic purposes.  Kidney Int  2004; 65:2426-2434.

102. Wittke S, Fliser D, Haubitz M, et al: Determination of peptides and proteins in human urine with capillary electrophoresis-mass spectrometry, a suitable tool for the establishment of new diagnostic markers.  J Chromatogr A  2003; 1013:173-181.

103. Haubitz M, Wittke S, Weissinger EM, et al: Urine protein patterns can serve as diagnostic tools in patients with IgA nephropathy.  Kidney Int  2005; 67:2313-2320.

104. Mischak H, Kaiser T, Walden M, et al: Proteomic analysis for the assessment of diabetic renal damage in humans.  Clin Sci (Lond)  2004; 107:485-495.

105. Meier M, Kaiser T, Herrmann A, et al: Identification of urinary protein pattern in type 1 diabetic adolescents with early diabetic nephropathy by a novel combined proteome analysis.  J Diabetes Complications  2005; 19:223-232.

106. Rossing K, Mischak H, Parving HH, et al: Impact of diabetic nephropathy and angiotensin II receptor blockade on urinary polypeptide patterns.  Kidney Int  2005; 68:193-205.

107. Chalmers MJ, Mackay CL, Hendrickson CL, et al: Combined top-down and bottom-up mass spectrometric approach to characterization of biomarkers for renal disease.  Anal Chem  2005; 77:7163-7171.

108. Chapman K: The ProteinChip Biomarker System from Ciphergen Biosystems: A novel proteomics platform for rapid biomarker discovery and validation.  Biochem Soc Trans  2002; 30:82-87.

109. Tolson J, Bogumil R, Brunst E, et al: Serum protein profiling by SELDI mass spectrometry: Detection of multiple variants of serum amyloid alpha in renal cancer patients.  Lab Invest  2004; 84:845-856.

110. Won Y, Song HJ, Kang TW, et al: Pattern analysis of serum proteome distinguishes renal cell carcinoma from other urologic diseases and healthy persons.  Proteomics  2003; 3:2310-2316.

111. Fetsch PA, Simone NL, Bryant-Greenwood PK, et al: Proteomic evaluation of archival cytologic material using SELDI affinity mass spectrometry: Potential for diagnostic applications.  Am J Clin Pathol  2002; 118:870-876.

112. Clarke W, Silverman BC, Zhang Z, et al: Characterization of renal allograft rejection by urinary proteomic analysis.  Ann Surg  2003; 237:660-664; discussion 664-665.

113. Schaub S, Rush D, Wilkins J, et al: Proteomic-based detection of urine proteins associated with acute renal allograft rejection.  J Am Soc Nephrol  2004; 15:219-227.

114. O'Riordan E, Orlova TN, Mei JJ, et al: Bioinformatic analysis of the urine proteome of acute allograft rejection.  J Am Soc Nephrol  2004; 15:3240-3248.

115. Rogers MA, Clarke P, Noble J, et al: Proteomic profiling of urinary proteins in renal cancer by surface enhanced laser desorption ionization and neural-network analysis: Identification of key issues affecting potential clinical utility.  Cancer Res  2003; 63:6971-6983.

116. Schaub S, Wilkins J, Weiler T, et al: Urine protein profiling with surface-enhanced laser-desorption/ionization time-of-flight mass spectrometry.  Kidney Int  2004; 65:323-332.

117. Traum AZ, Wells MP, Aivado M, et al: SELDI-TOF MS of quadruplicate urine and serum samples to evaluate changes related to storage conditions.  Proteomics  2006; 6:1676-1680.

118. Nguyen MT, Ross GF, Dent CL, Devarajan P: Early prediction of acute renal injury using urinary proteomics.  Am J Nephrol  2005; 25:318-326.

119. Thongboonkerd V, Barati MT, McLeish KR, et al: Alterations in the renal elastin-elastase system in type 1 diabetic nephropathy identified by proteomic analysis.  J Am Soc Nephrol  2004; 15:650-662.

120. Rosca MG, Mustata TG, Kinter MT, et al: Glycation of mitochondrial proteins from diabetic rat kidney is associated with excess superoxide formation.  Am J Physiol Renal Physiol  2005; 289:F420-F430.

121. Voshol H, Brendlen N, Muller D, et al: Evaluation of biomarker discovery approaches to detect protein biomarkers of acute renal allograft rejection.  J Proteome Res  2005; 4:1192-1199.

122. Valkova N, Yunis R, Mak SK, et al: Nek8 mutation causes overexpression of galectin-1, sorcin, and vimentin and accumulation of the major urinary protein in renal cysts of jck mice.  Mol Cell Proteomics  2005; 4:1009-1018.

123. Pinet F, Poirier F, Fuchs S, et al: Troponin T as a marker of differentiation revealed by proteomic analysis in renal arterioles.  FASEB J  2004; 18:585-586.

124. Brooks HL, Sorensen AM, Terris J, et al: Profiling of renal tubule Na+ transporter abundances in NHE3 and NCC null mice using targeted proteomics.  J Physiol  2001; 530:359-366.

125. Bickel CA, Knepper MA, Verbalis JG, Ecelbarger CA: Dysregulation of renal salt and water transport proteins in diabetic Zucker rats.  Kidney Int  2002; 61:2099-2110.

126. Kim CH, Park DU, Chung AS, et al: Proteomic analysis of post-mitochondrial fractions of young and old rat kidney.  Exp Gerontol  2004; 39:1155-1168.

127. Cheung PY, Lai WP, Lau HY, et al: Acute and chronic effect of dietary phosphorus restriction on protein expression in young rat renal proximal tubules.  Proteomics  2002; 2:1211-1219.

128. Bandara LR, Kelly MD, Lock EA, Kennedy S: A potential biomarker of kidney damage identified by proteomics: Preliminary findings.  Biomarkers  2003; 8:272-286.

129. Cutler P, Bell DJ, Birrell HC, et al: An integrated proteomic approach to studying glomerular nephrotoxicity.  Electrophoresis  1999; 20:3647-3658.

130. Keller T, Pleskova M, McDonald MC, et al: Identification of manganese superoxide dismutase as a NO-regulated gene in rat glomerular mesangial cells by 2D gel electrophoresis.  Nitric Oxide  2003; 9:183-193.

131. Witzmann FA, Bauer MD, Fieno AM, et al: Proteomic analysis of the renal effects of simulated occupational jet fuel exposure.  Electrophoresis  2000; 21:976-984.

132. Witzmann FA, Carpenter RL, Ritchie GD, et al: Toxicity of chemical mixtures: Proteomic analysis of persisting liver and kidney protein alterations induced by repeated exposure of rats to JP-8 jet fuel vapor.  Electrophoresis  2000; 21:2138-2147.

133. Kanitz MH, Witzmann FA, Zhu H, et al: Alterations in rabbit kidney protein expression following lead exposure as analyzed by two-dimensional gel electrophoresis.  Electrophoresis  1999; 20:2977-2985.

134. Witzmann FA, Fultz CD, Grant RA, et al: Regional protein alterations in rat kidneys induced by lead exposure.  Electrophoresis  1999; 20:943-951.

135. Knepper MA, Kim GH, Masilamani S: Renal tubule sodium transporter abundance profiling in rat kidney: Response to aldosterone and variations in NaCl intake.  Ann N Y Acad Sci  2003; 986:562-569.

136. Wang XY, Masilamani S, Nielsen J, et al: The renal thiazide-sensitive Na-Cl cotransporter as mediator of the aldosterone-escape phenomenon.  J Clin Invest  2001; 108:215-222.

137. Hampel DJ, Sansome C, Sha M, et al: Toward proteomics in uroscopy: Urinary protein profiles after radiocontrast medium administration.  J Am Soc Nephrol  2001; 12:1026-1035.

138. Charlwood J, Skehel JM, King N, et al: Proteomic analysis of rat kidney cortex following treatment with gentamicin.  J Proteome Res  2002; 1:73-82.

139. Witzmann FA, Li J: Proteomics and nephrotoxicity.  Contrib Nephrol  2004; 141:104-123.

140. Chabardes-Garonne D, Mejean A, Aude JC, et al: A panoramic view of gene expression in the human kidney.  Proc Natl Acad Sci U S A  2003; 100:13710-13715.

141. Barile M, Pisitkun T, Yu MJ, et al: Large scale protein identification in intracellular aquaporin-2 vesicles from renal inner medullary collecting duct.  Mol Cell Proteomics  2005; 4:1095-1106.

142. Omenn GS, States DJ, Adamski M, et al: Overview of the HUPO Plasma Proteome Project: Results from the pilot phase with 35 collaborating laboratories and multiple analytical groups, generating a core dataset of 3020 proteins and a publicly-available database.  Proteomics  2005; 5:3226-3245.

143. States DJ, Omenn GS, Blackwell TW, et al: Challenges in deriving high-confidence protein identifications from data gathered by a HUPO plasma proteome collaborative study.  Nat Biotechnol  2006; 24:333-338.

144. Skates S, Iliopoulos O: Molecular markers for early detection of renal carcinoma: Investigative approach.  Clin Cancer Res  2004; 10:6296S-6301S.

145. Li G, Passebosc-Faure K, Lambert C, et al: Flow cytometric analysis of antigen expression in malignant and normal renal cells.  Anticancer Res  2000; 20:2773-2778.

146. Lichtenfels R, Ackermann A, Kellner R, Seliger B: Mapping and expression pattern analysis of key components of the major histocompatibility complex class I antigen processing and presentation pathway in a representative human renal cell carcinoma cell line.  Electrophoresis  2001; 22:1801-1809.

147. Lichtenfels R, Kellner R, Bukur J, et al: Heat shock protein expression and anti-heat shock protein reactivity in renal cell carcinoma.  Proteomics  2002; 2:561-570.

148. Kiernan UA, Tubbs KA, Nedelkov D, et al: Comparative urine protein phenotyping using mass spectrometric immunoassay.  J Proteome Res  2003; 2:191-197.

149. Kiernan UA, Tubbs KA, Nedelkov D, et al: Comparative phenotypic analyses of human plasma and urinary retinol binding protein using mass spectrometric immunoassay.  Biochem Biophys Res Commun  2002; 297:401-405.

150. Renfrow MB, Cooper HJ, Tomana M, et al: Determination of aberrant O-glycosylation in the IgA1 hinge region by electron capture dissociation Fourier transform-ion cyclotron resonance mass spectrometry.  J Biol Chem  2005; 280:19136-19145.

151. Gillette MA, Mani DR, Carr SA: Place of pattern in proteomic biomarker discovery.  J Proteome Res  2005; 4:1143-1154.

152. Souchelnytskyi S: Bridging proteomics and systems biology: What are the roads to be traveled?  Proteomics  2005; 5:4123-4137.

153. de Graauw M, Tijdens I, Cramer R, et al: Heat shock protein 27 is the major differentially phosphorylated protein involved in renal epithelial cellular stress response and controls focal adhesion organization and apoptosis.  J Biol Chem  2005; 280:29885-29898.

154. Ransom RF, Vega-Warner V, Smoyer WE, Klein J: Differential proteomic analysis of proteins induced by glucocorticoids in cultured murine podocytes.  Kidney Int  2005; 67:1275-1285.

155. Perego RA, Bianchi C, Corizzato M, et al: Primary cell cultures arising from normal kidney and renal cell carcinoma retain the proteomic profile of corresponding tissues.  J Proteome Res  2005; 4:1503-1510.
